by BVCommander 1478 days ago
>Digital is cold and thin compared to analog.

Claude Shannon emphatically disagrees. When sampling at twice the highest audible frequency, they are mathematically equivalent.

Not quite - the Nyquist-Shannon theorem states that the original signal can be reconstructed, but in reproduction it usually isn't. Digital playback does not reproduce the full original signal - it plays it back on whatever output device it has, which does not generally recreate the original waveform.

That theorem also has technical constraints that actual sound does not meet: the signal must be band-limited, meaning its Fourier transform is zero outside some bound. Actual music does not satisfy this. It can only be approximated by signals that do.

For example, suppose you have a pure sine wave, sample it with enough density to make it mathematically reproducible, then play back those quantized samples on a piezo making square waves - it sounds pretty good (and can be indistinguishable to most ears), but it is not the same waveform.

Nothing reproduces the original signal; it's distorted by the inertia and impedance of the microphone and amplifier that recorded it.

As you know it's then passed through an ADC and stored as a sine wave, cause no one is mastering inaudible square waves on a reel for kitsch value.

Agreed - and after quantization nothing reproduces the signal before quantization. So claiming Nyquist-Shannon proves the OP wrong is incorrect.

>It's then passed through an ADC and stored as a sine wave.

After an ADC it is not stored as a sine wave. It's stored as quantized values, thus the 'D' in Analog-to-Digital-Converter.

>cause no one is mastering inaudible square waves on a reel for kitsch value

Pretty much all audio processing is now done digitally, which is the same as square waves - each jump in discrete digital value is a step function. When you push it through properly engineered output devices the squareness is smoothed somewhat, but still has frequency ringing because it is square edged.

Take a good speaker, take something that can grab audio spectrum far beyond audible, and look at the output. There is stuff far outside human hearing coming from the speaker because of these square waves. Naively, this is because the Fourier transform of the square waves has high frequency ringing, and this is because the playback has sharp edges. (See [1] for some related info.)

Also, Nyquist-Shannon is about frequencies, not about amplitudes, which are also quantized. Physical devices making sound have a (up to quantum level) continuum of possible amplitudes. Quantization necessarily loses this forever. For example, take A0, the lowest standard piano note, at ~27.5 Hz. Sample a pure sine wave at 60 Hz, in 8-bit audio. Now record this tone going from no sound up to very loud, very smoothly, over some time. The 8-bit audio will necessarily have less smoothness to it, since it has only 256 amplitude levels. It perfectly satisfies your Nyquist-Shannon claim, yet it fails to reproduce what you hear. Take 16-bit audio - better. Take 32-bit or floating point audio - better again. And so on.
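To put rough numbers on the bit depth point, here is a minimal sketch that quantizes a full-scale 27.5 Hz sine and measures the signal-to-noise ratio of the result. The 48 kHz sample rate, the sample count, and the `quantize` and `snr_db` helpers are assumptions for illustration, not anything from the thread.

```python
import math

def quantize(x, bits):
    """Round x in [-1, 1] to the nearest level of a signed integer grid."""
    scale = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 32767 for 16 bits
    return round(x * scale) / scale

def snr_db(bits, freq_hz=27.5, sample_rate_hz=48000, n_samples=19200):
    """Signal-to-quantization-noise ratio of a full-scale sine, in dB.

    The sample rate and sample count are assumed values; 19200 samples
    cover exactly 11 cycles of 27.5 Hz at 48 kHz.
    """
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        x = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        err = quantize(x, bits) - x  # what quantization threw away
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

snr_8 = snr_db(8)
snr_16 = snr_db(16)
print(f"8-bit:  {snr_8:.1f} dB")
print(f"16-bit: {snr_16:.1f} dB")
```

The error never reaches zero at any finite bit depth, but each extra bit buys roughly 6 dB, which is the usual rule of thumb for a full-scale sine.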

I agree that really well engineered systems can push the errors outside human hearing, but to claim they reproduce the same signals is incorrect.

The gist of this is: to get the most accurate reproduction of the original, merely sampling at 2x the top human freq is nowhere near state of the art.

A counter-intuitive example: to get the best quality and most accurate output, one needs to add noise to the input. The reason is that due to quantization, if some input signal is between possible quantized output values, adding noise (usually Gaussian, of std dev ~sqrt(step size)) makes that signal trigger both high and low quantized values in proportion to the intermediate value, so that the stepped output approximates the original as closely as possible. The entire field is full of stuff like this.
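The noise trick can be sketched in a few lines. One swap to note: the comment above mentions Gaussian noise, while this sketch uses uniform noise exactly one quantization step wide, purely because it makes the arithmetic easy to check. The input value 0.3, the sample count, and the helper names are likewise assumptions for illustration.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer level (the step size is 1.0 here)."""
    return math.floor(x + 0.5)

def average_output(x, n, dither, rng):
    """Average of n quantized readings of the constant input x."""
    total = 0
    for _ in range(n):
        # Assumed dither: uniform noise one step wide, centred on zero.
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(x + noise)
    return total / n

rng = random.Random(0)  # fixed seed so the run is repeatable
plain = average_output(0.3, 100_000, dither=False, rng=rng)
dithered = average_output(0.3, 100_000, dither=True, rng=rng)
print("without noise:", plain)
print("with noise:   ", dithered)
```

Without noise, every reading of 0.3 quantizes to 0 and the sub-step information is gone. With noise, about 30% of the readings land on 1, so the average lands near 0.3.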

For reference, I've worked on stuff like this on and off for decades, having written libraries used by others, designed high end audio simulation software (think raytracer for audio in physical settings to help design stadiums), written articles, and produced hardware in a company I own. I am quite familiar with all sorts of audio processing.

[1] https://electronics.stackexchange.com/questions/156197/can-c...

> which is the same as square waves ... There is stuff far outside human hearing coming from the speaker because of these square waves.

Maybe back in the 1980s on some of the early consumer digital equipment; but those problems were solved in the early 1990s by oversampling in the DAC, and then using some basic analog filtering far above the human hearing range.

I.e., a consumer DAC will oversample a 44.1 kHz signal to (for example) 705.6 kHz in the digital domain, and then use a very gentle analog lowpass filter to deal with the ultrasonic distortion. At that point the signal coming from the DAC is approximately as accurate as if there were no DAC in the first place. (Granted, some people can hear up to 27 kHz, which is why some people like 96 kHz sampling rates.)
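A quick sketch of why the analog filter can be gentle: it has to pass the audio band and be well into attenuation by the lowest spectral image, which starts at the sample rate minus the highest signal frequency. The 20 kHz audible limit and the `transition_octaves` helper are assumptions for illustration; the two sample rates are the ones quoted above.

```python
import math

BASE_RATE_HZ = 44_100
OVERSAMPLED_RATE_HZ = 705_600
AUDIBLE_LIMIT_HZ = 20_000  # assumed upper edge of the audio band

def transition_octaves(sample_rate_hz, passband_edge_hz):
    """Octaves between the passband edge and the lowest spectral image."""
    first_image_hz = sample_rate_hz - passband_edge_hz
    return math.log2(first_image_hz / passband_edge_hz)

ratio = OVERSAMPLED_RATE_HZ // BASE_RATE_HZ
narrow = transition_octaves(BASE_RATE_HZ, AUDIBLE_LIMIT_HZ)
wide = transition_octaves(OVERSAMPLED_RATE_HZ, AUDIBLE_LIMIT_HZ)

print(f"oversampling ratio: {ratio}x")
print(f"room for the analog filter at 44.1 kHz:  {narrow:.2f} octaves")
print(f"room for the analog filter at 705.6 kHz: {wide:.2f} octaves")
```

At the base rate the filter has about a quarter of an octave to go from passing to blocking. After oversampling it has about five octaves, so a low-order analog filter is enough.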

>but those problems were solved in the early 1990s by oversampling in the DAC, and then using some basic analog filtering far above the human hearing range.

You're writing about the sampling end. I said the physical speaker creates high frequencies on playback based on material properties of the device - and I gave a decent reference where you can read the discussion on it.

No amount of filtering at the sampling end will remove physically created noise due to the physical playback membrane that moves air to create sound waves.

I am fully aware of using bandpass filters during sampling. I use them all the time to remove things I don't need before doing things like wavelet transforms to pull music information out of the result. And I often design things up front based on the physical playback mechanism if I know it ahead of time. Or if the hardware (such as embedded devices) will only sample at certain rates, or certain bit depths. Knowing as much about the entire audio path up front helps design each and every piece of the complete signal path.

Here's a simple example: basic speakers are an electromagnet coil - apply voltage V and the membrane jumps to a position. Different values for V make different positions.

Quantized playback, going through a DAC, will create distinct voltage levels. 8 bits will give 256 such levels. 16 bits, 65536 levels.

When that hits a speaker, the speaker membrane jumps to that level. There is some noise from inertia and momentum between one point and the next, but the end effect is the same - the speaker trying to make a square wave edge. There is no uniformly smooth movement from position to position - only jumps.

This can be seen by putting a mirror on the speaker, bouncing a laser off it onto a large wall, and recording the wall at high speed - you see jumpy movement. Fiddle sometime with a pure tone sent at various bit depths to a speaker and watch the laser.

Now, these movements create frequencies in output not in the original analog signal, not in the digital signal, but purely as a physical artifact. And they depend on the playback device - all sorts of work and research is spent on speaker tech, materials, reproducible construction, and on and on, to make the output physical waveform as uniform and smooth as possible over all the possible input voltage jumps and frequencies desired. But all are imperfect, similarly to how all physical lenses (well, except 1-1 and flips) must distort images. It is all about the tradeoffs.

Up to the Nyquist limit, a digital signal will completely recreate the original signal, with no square wave steps. Digitisation does not result in square wave output anywhere in the output chain.
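A minimal sketch of that reconstruction claim, using the Whittaker-Shannon (sinc) interpolation sum. It is idealized on purpose: the samples are unquantized floats, the sum is truncated to a finite window, and the 1 kHz tone, the 8 kHz sample rate, and the helper names are assumptions for illustration. It says nothing about quantization or about what a physical speaker does.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, t):
    """Truncated Whittaker-Shannon sum at time t, measured in sample periods."""
    return sum(s * sinc(t - (first_index + k)) for k, s in enumerate(samples))

SAMPLE_RATE_HZ = 8000.0  # assumed
TONE_HZ = 1000.0         # assumed, well below the 4 kHz Nyquist limit
HALF_WINDOW = 2000       # samples kept on each side of t = 0

samples = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
           for n in range(-HALF_WINDOW, HALF_WINDOW + 1)]

t = 0.37  # a point between sample 0 and sample 1
exact = math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE_HZ)
estimate = reconstruct(samples, -HALF_WINDOW, t)
held = samples[HALF_WINDOW]  # stairstep: hold the value of sample 0

sinc_error = abs(estimate - exact)
hold_error = abs(held - exact)
print(f"sinc reconstruction error: {sinc_error:.6f}")
print(f"stairstep (hold) error:    {hold_error:.6f}")
```

The sinc sum recovers the value between samples far more closely than holding the last sample does, and the remaining error comes from truncating the sum to a finite window.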

Chris Montgomery (of Ogg / Speech / xiph.org & RedHat) did a series of videos going into this in considerable depth. I encourage you to watch them.

https://xiph.org/video/

I am fully aware of those videos and claims. Did you read my posts or the links? What I posted goes vastly beyond what Chris Montgomery wrote, and far beyond his claims.

Again, in different words:

Nyquist assumes infinitely precise samples. It's math, not computer sampling. This never happens, since samples are quantized. Having samples at the proper number of Hz is useless unless enough precision is there, and non-infinite precision implies the original signal is never reconstructable. We're dealing with computers, not the real numbers.

Take a pure sine wave. Sample it mathematically. Quantize those values. Now what sin value reproduces those quantized values? None. Never. They are rational numbers - it is mathematically impossible to fit a single sine wave to them, since sine is a transcendental function. End of story.

Sine is a transcendental function, so rational inputs (other than 0) do not give rational outputs. So you cannot sample it to perfection with a digital device. You can approximate it. That approximation matters. Digital sampling takes rational input deltas (sampling rate) and necessarily obtains imperfect samples, since you quantized the actual value of a sine wave. So Nyquist fails.

Yes, for a bandlimited signal of a given frequency, Nyquist lets you reconstruct that frequency given infinite precision. This NEVER happens in practice, since it assumes infinitely precise samples. Montgomery ignores this (and a host of other issues - he's at level 2 of a 100 level tower. People at level 0 see his videos and assume there are only 2 levels to the tower). Bit depth matters. Nyquist does nothing about amplitude quantization, which is needed. It ignores the path to reconstruction - Nyquist only applies to a perfect (not floating point or integer) reconstruction of the signal. Nyquist does not deal with the fact that the quantized values, when pushed to any physical device used to reconstruct audio, are more like stairsteps than pure sine waves.

Most physical devices performing playback are more stairstep than smooth sine values, so they are not reconstructing the input signal - another issue that matters. Input signals are (nearly, up to quantum level) infinitely precise in amplitudes - output devices tend to be more quantized.

Please read the thread I wrote and think through it. I posted a link to a good discussion, I posted a simple experiment or two you can do, I posted (here) a simple mathematical exercise showing that Nyquist fails for this.

1. Montgomery talks about quantization noise. He doesn’t assume infinite precision.

2. DAC output is not stairstep. There’s an analog reconstruction filter that filters out the ultrasonic components, thereby getting back the original smooth waveform. You should read up on DAC design.

>For example, suppose you have a pure sine wave, sample it with enough density to make it mathematically reproducible, then play back those quantized samples on a piezo making square waves - it sounds pretty good (and can be indistinguishable to most ears), but it is not the same waveform.

I'm pro analog, but the above is a common misconception, usually caused by the bad "popular science" articles on digital reproduction and sampling, which show quantization as little pixelated waveforms etc.

The waveform produced from a sampled digital signal recreates the original perfectly (for the target frequency), you can verify that on an oscilloscope.

I just went through this - you cannot. Sine is transcendental. For no rational inputs (other than 0) is the output rational. Sampling at discrete timesteps and quantizing is giving you not samples of a sine wave, but samples close to a sine wave. The reconstruction is not the original sine wave - it's no longer meeting Nyquist theorem requirements.

The oscilloscope can detect that the pattern of inputs has a frequency, but it's necessarily an approximation at this point. An oscilloscope adds enough noise and error from its own workings that on a screen, to your eye, for certain sampling parameters, it looks close. But that is not the original signal.

Take a signal, sample it, DAC it, and try using that signal to cancel the original, amplify the result, and run that through an oscilloscope. If the signal were reconstructed, you should be able to get zero.

You don't. Now you see the differences in the reconstructed signal and the original.

On the flip side of this issue, I have been rediscovering my love for the sound of old pop tunes on an AM radio.

1008 AM is a great station available through this webSDR service: http://websdr.ewi.utwente.nl:8901/

You failed to understand Claude Shannon. Look up "aliasing distortion".

He disagrees, but it's not about the mathematics.

The "cold and thin" vs warm and thick lies in the nonlinearities, the subtle saturation you get on tape/vinyl, the patina, even the tactility. The ease of skipping makes it even worse.

And that's before we've got to the cultural changes in attitudes towards music between the vinyl/tape and the CD/streaming eras (where music, which was king, is now just another pastime for teens), or the changes in musical content.

But what about your own ears?

To hand me this secondhand abstraction contrived by this guy I never met. It's a bit crazy.

Our own senses deceive us *constantly*. We have a literal blind spot in the middle of our eyes yet we aren't even aware of it unless pointed out through precisely designed experiments...

The moderators removed my reply, the parent.

Shadow-removed in fact. Which is to say, they removed it without warning, notice or explanation. And from my account-view, it was not removed at all.

And when I asked what's up with that, I was threatened with banning "if I keep it up".

What do you think of that?