by thirdhaf 5002 days ago
Assuming a 44.1kHz sampling rate, the smallest path length difference between the ears is approximately 6.8mm. This corresponds to 2.3 degrees (with the sound source at infinity). Humans can place sounds with about 3 degrees of resolution, so unless you have some citations I am seriously skeptical about the claim that higher sampling frequencies give you anything whatsoever.
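For anyone who wants to check the arithmetic, here is a rough sketch. The comment does not state a speed of sound or an ear spacing; the values below (300 m/s and 343 m/s, and 0.17 m between the ears) are my assumptions. The first speed was chosen because it reproduces the quoted 6.8mm and 2.3 degrees.

```python
import math

EAR_SPACING_M = 0.17  # assumed distance between the ears, not from the comment


def path_per_sample_mm(sample_rate_hz, speed_of_sound_m_s):
    """Distance sound travels during one sample period, in millimetres."""
    return speed_of_sound_m_s / sample_rate_hz * 1000.0


def angle_deg(path_mm, ear_spacing_m=EAR_SPACING_M):
    """Angle of a far-field source whose inter-ear path difference is path_mm."""
    return math.degrees(math.asin(path_mm / 1000.0 / ear_spacing_m))


if __name__ == "__main__":
    # 300 m/s is a round-number assumption; 343 m/s is air at about 20 C.
    for speed in (300.0, 343.0):
        path = path_per_sample_mm(44100, speed)
        print(f"{speed:.0f} m/s: {path:.2f} mm per sample, "
              f"{angle_deg(path):.2f} degrees")
```

With 343 m/s the same calculation gives roughly 7.8mm and 2.6 degrees, so the choice of speed moves the numbers a little without changing the comparison against ~3 degrees.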

44.1kHz is not the output frequency; it's the sampling frequency from which an unambiguous 20kHz signal (of sufficient amplitude) can be reconstructed. So your calculation needs to take into account that the maximum peak resolution is 20kHz (and that 16 bits may not have sufficient amplitude sampling resolution to accurately interpolate the waveform peaks even if their frequency -- their spacing rather than their position in time -- can be reconstructed unambiguously).
Completely agreed about 44.1kHz being a sampling rate. To make the rest of the numbers nicer, I'm going to pretend we're talking about 40kHz sampling for the rest of this comment. It's clear that you can't distinguish between a 0-degree phase-shifted 20kHz signal and a 0-degree phase-shifted 60kHz signal (I thought we weren't going to talk about Nyquist here). However, by the same token you CAN represent a phase-shifted 20kHz signal just fine. By way of example, consider the bit stream (I'm also pretending 1-bit sampling for now) [0 1 0 1 0 1], which in this format represents a 20kHz signal. It's pretty obvious that we can represent the same signal delayed by one sample, which at 20kHz is a 180-degree phase shift, as [1 0 1 0 1 0], and there's nothing in the input stage that stops the original source from creating this particular type of signal. In fact we can unambiguously represent ANY phase shift in the same fashion, but as you point out, then we start arguing about whether your samples have enough precision to unambiguously represent the input. With a one-sample shift (corresponding to a ~23usec delay at 44.1kHz) we have none of the precision problems you allude to.
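A small sketch of the 1-bit example, in case it helps. It uses the pretend 40kHz sampling rate from the comment; the function name, the choice of a negated cosine (so the reference stream starts with 0), and the threshold at zero are my own assumptions for illustration.

```python
import math

SAMPLE_RATE_HZ = 40_000  # the "pretend" rate used in the comment


def one_bit_stream(freq_hz, delay_s=0.0, n_samples=6):
    """Sample a negated cosine tone and keep only its sign as one bit."""
    bits = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE_HZ - delay_s
        value = -math.cos(2.0 * math.pi * freq_hz * t)
        bits.append(1 if value > 0 else 0)
    return bits


if __name__ == "__main__":
    one_sample = 1.0 / SAMPLE_RATE_HZ  # 25 usec at 40kHz
    print(one_bit_stream(20_000))              # reference 20kHz tone
    print(one_bit_stream(60_000))              # 60kHz aliases onto the same bits
    print(one_bit_stream(20_000, one_sample))  # delayed by one sample period
```

The 20kHz and 60kHz tones produce the same stream, while delaying the 20kHz tone by one sample period flips every bit.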
But then you have to ask what actually matters, if the final delivery destination of this information is the ear canal of an adult human.

Given the speed of sound through air, how does a ~23usec time offset compare to the involuntary micro-movements of the head of a typical conscious human?
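To put that question in concrete units, here is a quick conversion between time offsets and distances. The 343 m/s speed of sound and the 1mm displacement are assumptions picked for illustration, not measurements of head movement.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 C


def offset_to_distance_mm(offset_us):
    """Distance sound travels during a time offset given in microseconds."""
    return SPEED_OF_SOUND_M_S * offset_us * 1e-6 * 1000.0


def distance_to_offset_us(distance_mm):
    """Arrival-time change caused by moving distance_mm along the sound path."""
    return distance_mm / 1000.0 / SPEED_OF_SOUND_M_S * 1e6


if __name__ == "__main__":
    print(f"23 usec is about {offset_to_distance_mm(23.0):.1f} mm of travel")
    # 1 mm is a hypothetical displacement, used only to show the scale.
    print(f"1 mm of movement is about {distance_to_offset_us(1.0):.1f} usec")
```

Under these assumptions a ~23usec offset is equivalent to a shift of a little under 8mm along the line to the source.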