Hacker News
by BVCommander 1473 days ago
Nothing reproduces the original signal; it's distorted by the inertia and impedance of the microphone and amplifier that recorded it.

As you know it's then passed through an ADC and stored as a sine wave, cause no one is mastering inaudible square waves on a reel for kitsch value.


Agreed - and after quantization nothing reproduces the signal before quantization. So claiming Nyquist-Shannon proves the OP wrong is incorrect.

>It's then passed through an ADC and stored as a sine wave.

After an ADC it is not stored as a sine wave. It's stored as quantized values, thus the 'D' in Analog-to-Digital-Converter.

>cause no one is mastering inaudible square waves on a reel for kitsch value

Pretty much all audio processing is now done digitally, which is the same as square waves - each jump in discrete digital value is a step function. When you push it through properly engineered output devices the squareness is smoothed somewhat, but still has frequency ringing because it is square edged.

Take a good speaker, take something that can grab audio spectrum far beyond audible, and look at the output. There is stuff far outside human hearing coming from the speaker because of these square waves. Naively, this is because the Fourier transform of the square waves has high frequency ringing, and this is because the playback has sharp edges. (See [1] for some related info.)

Also, Nyquist-Shannon is about frequencies, not about amplitudes, which are also quantized. Physical devices making sound have a (up to quantum level) continuum of possible amplitudes. Quantization necessarily loses this forever. For example, take A0, the lowest standard piano note, at ~27.5 Hz. Sample a pure sine wave at 60 Hz, in 8-bit audio. Now record this tone going from no sound up to very loud, very smoothly, over some time. The 8-bit audio will necessarily have less smoothness to it, since it is 8-bit audio. It perfectly matches your Nyquist-Shannon claim, yet it fails to reproduce what you hear. Take 16-bit audio - better. Take 32-bit or floating point audio, better again. And so on.
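To put rough numbers on the bit-depth point, here is a minimal Python sketch. It assumes a 27.5 Hz tone sampled at 60 Hz with a linear fade-in from silence to full scale, and a plain uniform rounding quantizer. The helper names `quantize` and `rms_error` are made up for illustration, and the 24-bit case is added only to extend the comparison.

```python
import math

def quantize(x, bits):
    """Round x in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    step = 2.0 / (2 ** bits - 1)
    return round((x + 1.0) / step) * step - 1.0

def rms_error(bits, tone_hz=27.5, rate_hz=60.0, n=6000):
    """RMS quantization error for a tone fading in linearly from silence."""
    total = 0.0
    for k in range(n):
        envelope = k / (n - 1)  # 0.0 (silent) up to 1.0 (full scale)
        x = envelope * math.sin(2 * math.pi * tone_hz * k / rate_hz)
        total += (quantize(x, bits) - x) ** 2
    return math.sqrt(total / n)

# Compare the error left behind by each bit depth on the same samples.
errors = {bits: rms_error(bits) for bits in (8, 16, 24)}
for bits, err in errors.items():
    print(f"{bits}-bit: RMS error {err:.3e}")
```

The error shrinks with every extra bit but never reaches zero. Whether the remaining error is audible is a separate question.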

I agree that really well engineered systems can push the errors outside human hearing, but to claim they reproduce the same signals is incorrect.

The gist of this is: to get the most accurate reproduction of the original, merely sampling at 2x the top human freq is nowhere near state of the art.

A counter-intuitive example: to get the best quality and most accurate output, one needs to add noise to the input. The reason is that due to quantization, if some input signal is between possible quantized output values, adding noise (usually Gaussian, of std dev ~sqrt(step size)) makes that signal trigger both high and low quantized values in proportion to the intermediate value, so that the stepped playback output approximates the original as closely as possible. The entire field is full of stuff like this.
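A small Python sketch of the dithering idea, under simplifying assumptions: the input is a constant value sitting between two quantization levels, and the noise is uniform over one step rather than the Gaussian noise mentioned above, which keeps the arithmetic easy to check. `STEP`, `quantize` and `mean_output` are illustrative names only.

```python
import math
import random

STEP = 1.0  # spacing between quantization levels (one LSB)

def quantize(x):
    """Round x to the nearest multiple of STEP."""
    return STEP * math.floor(x / STEP + 0.5)

def mean_output(x, dither, n=100_000, seed=0):
    """Average quantizer output for a constant input x, with or without dither."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    total = 0.0
    for _ in range(n):
        # Assumption: uniform dither one step wide, swapped in for Gaussian.
        noise = rng.uniform(-STEP / 2, STEP / 2) if dither else 0.0
        total += quantize(x + noise)
    return total / n

x = 0.3                                 # sits between levels 0.0 and 1.0
plain = mean_output(x, dither=False)    # every sample snaps to the same level
dithered = mean_output(x, dither=True)  # levels 0.0 and 1.0 mix in proportion
print(plain, dithered)
```

Without dither every sample lands on the same level, so the in-between value is lost. With dither the two neighbouring levels are hit in proportion, and the average tracks the input.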

For reference, I've worked on stuff like this on and off for decades, having written libraries used by others, designed high end audio simulation software (think raytracer for audio in physical settings to help design stadiums), written articles, and produced hardware in a company I own. I am quite familiar with all sorts of audio processing.

[1] https://electronics.stackexchange.com/questions/156197/can-c...

> which is the same as square waves ... There is stuff far outside human hearing coming from the speaker because of these square waves.

Maybe back in the 1980s on some of the early consumer digital equipment, but those problems were solved in the early 1990s by oversampling in the DAC, and then using some basic analog filtering far above the human hearing range.

I.e., a consumer DAC will oversample a 44.1 kHz signal to (for example) 705.6 kHz in the digital domain, and then use a very gentle analog lowpass filter to deal with the ultrasonic distortion. At that point the signal coming from the DAC is approximately as accurate as if there was no DAC in the first place. (Granted, some people can hear up to 27 kHz, which is why some people like 96 kHz sampling rates.)
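The arithmetic behind the "very gentle" filter can be sketched in a few lines of Python. This assumes a 20 kHz audio band edge (an assumption, not a figure from the thread) and takes the oversampling factor as 16, since 705.6 kHz / 44.1 kHz = 16. The lowest spectral image then sits at the sample rate minus the band edge, and the filter has the gap between the two in which to roll off.

```python
import math

AUDIO_BAND_HZ = 20_000.0  # assumed upper edge of the audio band
BASE_RATE_HZ = 44_100.0
OVERSAMPLING = 16         # 44.1 kHz * 16 = 705.6 kHz

def transition_octaves(rate_hz, band_hz=AUDIO_BAND_HZ):
    """Octaves between the audio band edge and the lowest image frequency."""
    lowest_image_hz = rate_hz - band_hz
    return math.log2(lowest_image_hz / band_hz)

base = transition_octaves(BASE_RATE_HZ)
oversampled = transition_octaves(BASE_RATE_HZ * OVERSAMPLING)
print(f"44.1 kHz:  {base:.2f} octaves of transition band")
print(f"705.6 kHz: {oversampled:.2f} octaves of transition band")
```

At the base rate the analog filter has well under one octave to work with. After oversampling it has about five, which is why a low-order filter is enough.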

>but those problems were solved in the early 1990s by oversampling in the DAC, and then using some basic analog filtering far above the human hearing range.

You're writing about the sampling end. I said the physical speaker creates high frequencies on playback based on material properties of the device - and I gave a decent reference where you can read the discussion on it.

No amount of filtering at the sampling end will remove physically created noise due to the physical playback membrane that moves air to create sound waves.

I am fully aware of using bandpass filters during sampling. I use them all the time to remove things I don't need before doing things like wavelet transforms to pull music information out of the result. And I often design things up front based on the physical playback mechanism if I know it ahead of time. Or if the hardware (such as embedded devices) will only sample at certain rates, or certain bit depths. Knowing as much about the entire audio path up front helps design each and every piece of the complete signal path.

Here's a simple example: basic speakers are an electromagnet coil - apply voltage V and the membrane jumps to a position. Different values for V make different positions.

Quantized playback, going through a DAC, will create distinct voltage levels. 8 bits will give 256 such levels. 16 bits, 65536 levels.

When that hits a speaker, the speaker membrane jumps to that level. There is some noise from inertia and momentum in moving from point to point, but the end effect is the same - the speaker trying to make a square wave edge. There is no uniformly smooth movement from position to position - only jumps.

This can be seen by putting a mirror on the speaker, bouncing a laser off it onto a large wall, and recording the wall at high speed - you see jumpy movement. Fiddle sometime with a pure tone sent at various bit depths to a speaker and watch the laser.

Now, these movements create frequencies in output not in the original analog signal, not in the digital signal, but purely as a physical artifact. And they depend on the playback device - all sorts of work and research is spent on speaker tech, materials, reproducible construction, and on and on, to make the output physical waveform as uniform and smooth as possible over all the possible input voltage jumps and frequencies desired. But all are imperfect, similarly to how all physical lenses (well, except 1-1 and flips) must distort images. It is all about the tradeoffs.

Up to the Nyquist limit, a digital signal will completely recreate the original signal, with no square wave steps. Digitisation does not result in square wave output anywhere in the output chain.

Chris Montgomery (of Ogg / Speech / xiph.org & RedHat) did a series of videos going into this in considerable depth. I encourage you to watch them.

https://xiph.org/video/

I am fully aware of those videos and claims. Did you read my posts or the links? What I posted goes vastly beyond what Chris Montgomery wrote, and far beyond his claims.

Again, in different words:

Nyquist assumes infinitely precise samples. It's math, not computer sampling. This never happens, since samples are quantized. Having samples at the proper number of Hz is useless unless enough precision is there, and non-infinite precision implies the original signal is never reconstructable. We're dealing with computers, not the real numbers.

Take a pure sine wave. Sample it mathematically. Quantize those values. Now what sin value reproduces those quantized values? None. Never. They are rational numbers - it is mathematically impossible to fit a single sine wave to them, since sine is a transcendental function. End of story.

Sine is a transcendental function, so rational inputs (other than 0) do not give rational outputs. So you cannot sample it to perfection with a digital device. You can approximate it. That approximation matters. Digital sampling takes rational input deltas (sampling rate) and necessarily obtains imperfect samples, since you quantized the actual value of a sine wave. So Nyquist fails.

Yes, for a bandlimited signal of a given frequency, Nyquist lets you reconstruct that frequency given infinite precision. This NEVER happens in practice, since it assumes infinitely precise samples. Montgomery ignores this (and a host of other issues - he's at level 2 of a 100 level tower. People at level 0 see his videos and assume there are only 2 levels to the tower). Bitdepth matters. Nyquist does nothing about amplitude quantization, which is needed. It ignores the path to reconstruction - Nyquist only applies to a perfect (not floating point or integer) reconstruction of the signal. Nyquist does not deal with the fact that the quantized values, when pushed to any physical device used to reconstruct audio, are more like stairsteps than pure sine waves.

Most physical devices performing playback are more stairstep than smooth sine values, so they are not reconstructing the input signal - another issue that matters. Input signals are (nearly, up to quantum level) infinitely precise in amplitudes - output devices tend to be more quantized.

Please read the thread I wrote and think through it. I posted a link to a good discussion, I posted a simple experiment or two you can do, I posted (here) a simple mathematical exercise showing that Nyquist fails for this.

1. Montgomery talks about quantization noise. He doesn’t assume infinite precision.

2. DAC output is not stairstep. There’s an analog reconstruction filter that filters out the ultrasonic components, thereby getting back the original smooth waveform. You should read up on DAC design.

>Montgomery talks about quantization noise. He doesn’t assume infinite precision.

In [0], at time 23:50, he states that there is only one band limited signal that passes through each sample point - this requires infinite precision for reasons I explained elsewhere. It's a theoretical idealization that makes math easier, just like frictionless physics, using the ideal gas law, approximating sin(x) by x for small x, and so on. It's nice, but it's not what happens in practice.

> DAC output is not stairstep. There’s an analog reconstruction filter that filters out the ultrasonic components, thereby getting back the original smooth waveform. You should read up on DAC design.

I wrote that the physical playback device, such as a speaker, adds ultrasonic noise due to simple physics. If this were not the case, there'd be very little need for such variety in speaker costs - they'd all just magically reproduce the perfect waveform. But they don't.

As to DACs, they most certainly do not work as you claim - and it's demonstrably impossible as I explained since sine is a transcendental function, and you lost information needed since you don't have infinite precision samples.

Let's pick a common DAC, say an Analog Devices AD5780, datasheet here [1]. Page 18 has the circuit diagram for - a resistor bank. That's a stairstep (minus some physical noise at the transitions). If you look over the previous pages (Vout is the out signal you want to look at), it clearly outputs a fixed, discrete voltage for a fixed input. Every common chip does this.

Care to point to a chip that guarantees "getting back the original smooth waveform"? I'd love to see the datasheet on such a device.

One can design or buy DACs for $500, $1000, and more, but these are not what most people use. And even these exist in such variety because, you guessed it, they cannot perfectly reproduce original waveforms, otherwise there'd be no need for such cost or variety. They all make tradeoffs and assumptions to cater to specific needs. Sure, they are very good, but they don't reproduce "the original smooth waveform".

As to analog filters providing magic, they too are not what you think - they're making assumptions and deviate from perfection with tradeoffs. I'd guess Analog Devices engineers can say it better than I : "A reconstruction filter is used at the output of the DAC to attenuate image frequencies. However, a physical filter cannot be implemented with ideal stop band rejection extending out to infinite frequency. This is due to component parasitic effects as well as the physical limitations of printed circuit board layout" [2].

The perfect filter for the output is a sinc (yes, with a 'c'). Anything else simply is not the output you desire. But the problem with sinc is it has infinite support, so is not usable in practice. Filtering on a DAC is therefore a finite support approximation to the sinc filter, and introduces error. Read here [3] for more, or look at a textbook on it. [3] also explains quite clearly that in reality none of this ends up exact as you claim.
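The truncation error is easy to see numerically. The Python sketch below assumes ideal (unquantized) samples of a 1 kHz tone at 8 kHz and a plain rectangular truncation of the sinc kernel, with no windowing. Real reconstruction filters are designed far more carefully, so treat this only as an illustration of finite support. The names `sample` and `reconstruct` are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def sample(n, tone_hz=1000.0, rate_hz=8000.0):
    """Ideal (unquantized) sample n of a pure tone."""
    return math.sin(2 * math.pi * tone_hz * n / rate_hz)

def reconstruct(t, half_width):
    """Estimate the signal at time t (in sample periods) from the
    2 * half_width samples nearest to t, using a truncated sinc kernel."""
    centre = math.floor(t)
    return sum(sample(n) * sinc(t - n)
               for n in range(centre - half_width + 1, centre + half_width + 1))

t = 0.5  # halfway between two samples
true_value = math.sin(2 * math.pi * 1000.0 * t / 8000.0)
errors = {k: abs(reconstruct(t, k) - true_value) for k in (4, 32, 256)}
for k, err in errors.items():
    print(f"{2 * k} samples: error {err:.5f}")
```

With 8 samples the estimate is off by a few percent of full scale. With 512 it is much closer, but a finite kernel never makes the error exactly zero.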

Your claims are idealizations that are not met in reality.

"You should read up on DAC design". Indeed.

So, do you have the spec sheet for this perfect reconstruction DAC you claim exists? I'd like to see one.

[0] https://www.youtube.com/watch?v=cIQ9IXSUzuM

[1] https://www.analog.com/media/en/technical-documentation/data...

[2] https://www.analog.com/media/en/technical-documentation/appl...

[3] https://en.wikipedia.org/wiki/Reconstruction_filter

1. No one said that the anti-imaging filter was literally built into the DAC package.

2. The analog reconstruction filter doesn’t need ideal stopband rejection. An oversampling DAC [0] pushes the image frequencies far beyond the passband, so a gentle analog filter is sufficient to suppress them to the noise floor.

3. No one said anything about mathematically perfect reproduction. Of course there is quantization noise. Of course there is clock jitter. And so on. But the cumulative effect of these is still way below the detectability threshold of the human ear. And the noise floor of a digital system is still way lower than what’s achievable with an analog one.

[0] https://www.analog.com/media/en/training-seminars/tutorials/...