Hacker News
by stan_rogers 5002 days ago
44.1kHz is not the output frequency; it's the sampling frequency from which an unambiguous 20kHz signal (of sufficient amplitude) can be reconstructed. So your calculation needs to take into account that the maximum peak resolution is 20kHz (and that 16 bits may not have sufficient amplitude sampling resolution to accurately interpolate the waveform peaks even if their frequency -- their spacing rather than their position in time -- can be reconstructed unambiguously).

Completely agreed about 44.1kHz being a sampling rate. To make the rest of the numbers nicer I'm going to pretend we're talking about 40kHz sampling for the rest of this comment. It's clear that you can't distinguish between a 0-degree phase shifted 20kHz signal and a 0-degree phase shifted 60kHz signal (I thought we weren't going to talk about Nyquist here). However, by the same token you CAN represent a phase-shifted 20kHz signal just fine.

By way of example, consider the bit stream [0 1 0 1 0 1] (I'm also pretending 1-bit sampling for now), which in this format represents a 20kHz signal. It's pretty obvious that we can represent the 180-degree phase shifted signal as [1 0 1 0 1 0], and there's nothing in the input stage that stops the original source from creating this particular type of signal. In fact we can unambiguously represent ANY phase shift in the same fashion, but as you point out, then we start arguing about whether your samples have enough precision to unambiguously represent the input. With a 180-degree shift (a one-sample delay: 25usec at 40kHz, ~23usec at the real 44.1kHz) we have none of the precision problems you allude to.
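
If you want to poke at this yourself, here's a rough sketch of the two claims above: the 20kHz/60kHz ambiguity and the one-sample shift. It's my own illustration under the same pretend assumptions (40kHz sampling, 1-bit samples); the helper names `sample_cosine` and `to_one_bit` are made up for the example.

```python
import math

FS = 40_000  # Hz; the pretend sampling rate from this comment, not the real 44.1kHz


def sample_cosine(freq_hz, n_samples, delay_samples=0, fs=FS):
    """Sample cos(2*pi*f*t) at rate fs, optionally delayed by whole samples."""
    return [
        math.cos(2 * math.pi * freq_hz * (n - delay_samples) / fs)
        for n in range(n_samples)
    ]


def to_one_bit(samples):
    """Crude 1-bit quantizer (an assumption): 1 for positive samples, else 0."""
    return [1 if s > 0 else 0 for s in samples]


tone_20k = to_one_bit(sample_cosine(20_000, 6))
tone_60k = to_one_bit(sample_cosine(60_000, 6))
delayed_20k = to_one_bit(sample_cosine(20_000, 6, delay_samples=1))

print(tone_20k)     # [1, 0, 1, 0, 1, 0]
print(tone_60k)     # [1, 0, 1, 0, 1, 0]  (same stream as the 20kHz tone)
print(delayed_20k)  # [0, 1, 0, 1, 0, 1]  (the other alternating pattern)
print(1e6 / FS)     # 25.0  (one sample period in microseconds at 40kHz)
```

Which of the two alternating patterns you call "unshifted" is just a choice of where t=0 falls; the point is that both fit in the format, and that the 60kHz tone lands on exactly the same samples as the 20kHz one.
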
But then you have to ask what actually matters, if the final delivery destination of this information is the ear canal of an adult human.

Given the speed of sound through air, how does a ~23usec time offset compare to the involuntary micro-movements of the head of a typical conscious human?
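
To put a number on that, here's a back-of-the-envelope conversion of one sample period into a distance travelled by sound. The 343 m/s figure is my assumption (roughly dry air at room temperature), and `offset_to_path_mm` is just a name I picked for the example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees C
SAMPLE_RATE_HZ = 44_100         # CD sampling rate


def offset_to_path_mm(offset_s, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Distance sound travels during a given time offset, in millimetres."""
    return offset_s * speed_m_per_s * 1000.0


one_sample_s = 1.0 / SAMPLE_RATE_HZ

print(round(one_sample_s * 1e6, 1))               # 22.7 (microseconds)
print(round(offset_to_path_mm(one_sample_s), 1))  # 7.8 (millimetres)
```

So under that assumption a one-sample offset is worth a bit under a centimetre of path length between the source and your ear.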