Hacker News
by shampto3 845 days ago
You’re right, but I fear this idea has become prevalent in audiophile communities where they only want to listen to files that are 96kHz or higher.

In my opinion, a high sample rate only really matters during the production phase and has no noticeable effect on the final product. If the producer uses a high sample rate during the creation process, I see no reason why the listener would care whether the file they’re listening to is higher than even 44.1kHz, unless they are planning on using it for their own production.

2 comments

People should prefer 48k over 44.1k, but not for fidelity. It would just make the world a better place if 44.1k audio files died out. The reasons it was chosen are no longer valid, yet we're stuck with it, and now every audio stack needs to be able to convert between 44.1/88.2 and 48/96. That is a solved problem, but it comes with a tradeoff between fidelity and performance that makes the resampling algorithm a critical design feature of those stacks.

All because Sony and Philips wanted 80 minutes of stereo audio on CDs decades ago.

https://en.wikipedia.org/wiki/PCM_adaptor

It's very likely that the 44.1 kHz rate comes from the PCM adaptors that were designed to take PCM audio and convert it to something that a video tape recorder would accept.

I watched a YouTube video a few months ago about these adaptors, and the presenter did the calculations showing how the 44.1 kHz, 16-bit format lines up with the video fields. There was a valid engineering reason for this sampling rate.
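
The arithmetic behind that claim is short enough to check. A minimal sketch, assuming the figures usually quoted for PCM adaptors (3 samples stored per usable video line, 245 usable lines per field at 60 fields/s, 294 at 50 fields/s). Those numbers are not given in the comment above, so treat them as assumptions:

```python
# Assumed PCM-adaptor figures (not stated in the thread):
# samples stored per video line, usable lines per field, fields per second.
SAMPLES_PER_LINE = 3

def adaptor_rate(fields_per_second, usable_lines_per_field):
    """Audio samples per second that fit into the video signal."""
    return fields_per_second * usable_lines_per_field * SAMPLES_PER_LINE

rate_60 = adaptor_rate(60, 245)  # 60-field video (assumed 245 usable lines)
rate_50 = adaptor_rate(50, 294)  # 50-field video (assumed 294 usable lines)

print(rate_60, rate_50)
```

Under those assumptions both video standards land on the same audio rate, which would explain why a single number suited both.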

However, the stories about one of the Sony executives having a particular piece of music in mind are true, and have to do with the diameter of the disc being enlarged compared to what Philips originally had in mind. By that time the bitrate was already decided.

I still agree that 48 kHz is a better choice today, especially after reading this paper.

Beethoven's 9th.

> Kees Immink, Philips' chief engineer, who developed the CD, recalls that a commercial tug-of-war between the development partners, Sony and Philips, led to a settlement in a neutral 12-cm diameter format. The 1951 performance of the Ninth Symphony conducted by Furtwängler was brought forward as the perfect excuse for the change,[76][77] and was put forth in a Philips news release celebrating the 25th anniversary of the Compact Disc as the reason for the 74-minute length.

https://en.wikipedia.org/wiki/Symphony_No._9_(Beethoven)#Com...

What _is_ the reason people should prefer 48k over 44.1k though?
To avoid the required non-integer resampling in software, as everything but music has basically standardized on 48k, and most platforms default to it.
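
To put a number on "non-integer": a small sketch using only the two rates named in this thread. The variable names are just for illustration.

```python
from fractions import Fraction

# Conversion ratio from 44.1 kHz to 48 kHz, reduced to lowest terms.
ratio = Fraction(48000, 44100)

# A rational resampler conceptually upsamples by the numerator,
# low-pass filters, then keeps every `down`-th sample.
up, down = ratio.numerator, ratio.denominator

# Within one family (44.1 -> 88.2) the ratio is a plain integer.
same_family = Fraction(88200, 44100)

print(up, down, same_family)
```

So going from 44.1k to 48k means a 160/147 conversion, while staying inside one family is a simple doubling.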
All TV and computer audio runs at it, since for TV/film purposes 48000 is a very nice round number.
While audio equipment and algorithms don't care about nice-looking numbers, I think the actually useful property is that 48000 has more favorable prime factors than 44100, which can be useful for resampling and other tasks.
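
For anyone who wants to see the factors, here is a quick sketch. Whether they count as "more favorable" is a judgment call. The frame rates checked (24, 25, 30) are my own choice of examples, not something from the comments above.

```python
def prime_factors(n):
    """Return the prime factorization of n as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

f48 = prime_factors(48000)    # {2: 7, 3: 1, 5: 3}
f441 = prime_factors(44100)   # {2: 2, 3: 2, 5: 2, 7: 2}

# Samples per video frame at a few integer frame rates (assumed examples).
frame_rates = (24, 25, 30)
per_frame_48 = {fps: 48000 / fps for fps in frame_rates}
per_frame_441 = {fps: 44100 / fps for fps in frame_rates}

print(f48, f441)
print(per_frame_48, per_frame_441)
```

48000 divides evenly at all three frame rates, while 44100 gives 1837.5 samples per frame at 24 fps.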
The same could be said about bit depth: 24 bits offers far fewer quantization artifacts than 16 bits, and those artifacts can readily show up during production processes such as dynamic range compression, but they are extremely well hidden by dithering with noise shaping, which gets applied during mastering, so ultimately listeners are fine either way.

However, any type of subsequent processing in the digital domain, even just a volume change by the listener if it's applied digitally in the 16 bit realm (i.e., without first upscaling to 24 bits), completely destroys the benefit of dithering. For that reason, we might say that additional processing isn't confined to the recording studio and can happen at the end user level.
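
A rough illustration of the resolution side of this point, not a model of any particular player. The function names and the 0.3 gain are made up for the example. It scales a handful of very quiet samples once directly in 16 bits, and once after promoting them to 24 bits:

```python
def apply_gain_16(samples, gain):
    """Scale 16-bit integer samples and round straight back to 16 bits."""
    return [round(s * gain) for s in samples]

def apply_gain_24(samples, gain):
    """Promote to 24 bits first (shift left by 8), then scale and round."""
    return [round((s << 8) * gain) for s in samples]

quiet = [-3, -2, -1, 0, 1, 2, 3]  # a few LSBs of a very quiet passage
gain = 0.3                        # hypothetical volume setting, about -10.5 dB

out16 = apply_gain_16(quiet, gain)
out24 = apply_gain_24(quiet, gain)

# Count how many distinct levels survive the volume change.
print(len(set(out16)), len(set(out24)))
```

In the 16-bit path the seven input levels collapse to three, while the 24-bit path keeps all seven apart. That collapse is new, undithered rounding applied on top of whatever dither the master carried.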

I'm unsure whether this same logic applies to sampling frequency, but probably? I guess post-mastering processing of amplitude is far more common than time-based changes, but maybe DJs doing beat matching?

I detect some fallacy here.

The real benefit is not using 6x network bandwidth, storage, memory, processing power and more battery of the mobile device. That benefit is not going anywhere, no matter what.
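
The 6x figure checks out for raw stereo PCM, assuming the comparison is 192 kHz/24-bit against 48 kHz/16-bit (the two formats named in this thread). Lossless compression would shift the exact ratio, so this is only the uncompressed case:

```python
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels=2):
    """Raw (uncompressed) PCM bitrate."""
    return sample_rate_hz * bit_depth * channels

hi_res = pcm_bits_per_second(192_000, 24)
plain = pcm_bits_per_second(48_000, 16)
ratio = hi_res / plain

print(hi_res, plain, ratio)
```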

Post-processing is applied to a signal that is physically impossible to distinguish from the source. It is true that processing often needs higher resolution, and DSPs will upsample internally, operate on floats, and then convert back. But to claim, without evidence, that post-processing may give a human listener back the ability to tell whether a 192/24 medium was used instead of 48/16 would be to reintroduce the same quality-loss paranoia, just with an extra step. If one couldn't hear the difference before an effect was applied... they won't hear it after.

As for DJs, they do use high-res assets when producing mixes. That's still the mastering stage, technically.

With music, in particular, if you use any analog sources while recording, the signal will contain so much noise that any dithering signal will be far below the floor and will most likely be completely redundant. I know that people claim to hear a difference, but they also claim to hear a difference between gold and copper contacts.
I hear no difference between undithered 16 bit and anything "better" (e.g. dithered 16 bit, or more bits) and anyone who claims they do should be highly scrutinized, when we're talking about a system (media, DAC, amplification, transducer, human) playing a mastered recording at a moderate volume setting. But I certainly hear the difference (as quantization artifacts) when cranking the volume up to extremely high levels when the source material is extremely quiet, like during a fade out, a reverb tail, or just something not properly mastered to use the full range; setting the volume to something that would totally clip the amp, blow the speakers, or deafen me if it weren't a very quiet part of the recording.

Dithering (or more bits) does solve for this. A fade out of the song also lowers the captured noise floor, but the dither function keeps going.
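
A toy version of the fade-out case, under assumptions of my own: a tone peaking at 0.4 LSB, plain rounding as the quantizer, and triangular (TPDF) dither with no noise shaping. It is a sketch of the idea, not how any mastering tool does it.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

N = 10_000
# A tone whose peak is 0.4 LSB, e.g. the tail of a fade-out.
signal = [0.4 * math.sin(2 * math.pi * n / 100) for n in range(N)]

def tpdf_dither():
    """Triangular dither spanning -1..+1 LSB (difference of two uniforms)."""
    return random.random() - random.random()

undithered = [round(s) for s in signal]
dithered = [round(s + tpdf_dither()) for s in signal]

def projection(quantized):
    """Least-squares fraction of the original tone present in the output."""
    num = sum(q * s for q, s in zip(quantized, signal))
    den = sum(s * s for s in signal)
    return num / den

print(projection(undithered), projection(dithered))
```

Without dither every sample rounds to zero and the tone is simply gone. With dither the output is noisy, but on average it still carries the tone at close to its original level.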

It's akin to noticing occasional posterization (banding) in very dark scenes if your TV isn't totally crushing the blacks. With a higher than recommended black level, you will see this artifact, because perceptual video codecs destroy (for efficiency purposes) the visual dither that would otherwise soften the bands of dark color into a nice grainy halftone sort of thing which would be much less offensive.

Bit depth is only useful for reducing the noise floor: the lower the bit depth, the higher the noise.

That’s why producers (mixing many tracks in a session) want to use high bit depth stems, because they are summing the noise from n tracks.
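
Some back-of-the-envelope numbers for this, using the textbook formula for the SNR of a full-scale sine against quantization noise. It also assumes the noise in each track is uncorrelated and at equal level. The 64-track session is a made-up example.

```python
import math

def quantization_snr_db(bits):
    """Theoretical SNR of a full-scale sine against quantization noise."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def summed_noise_rise_db(tracks):
    """Noise-floor rise when summing uncorrelated, equal-level noise."""
    return 10 * math.log10(tracks)

snr16 = quantization_snr_db(16)     # roughly 98 dB
snr24 = quantization_snr_db(24)     # roughly 146 dB
rise_64 = summed_noise_rise_db(64)  # hypothetical 64-track session

print(round(snr16, 1), round(snr24, 1), round(rise_64, 1))
```

Under those assumptions, summing 64 tracks raises the noise floor by about 18 dB, which eats into a 16-bit floor far more noticeably than a 24-bit one.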

It’s a pointless exercise for DJs or anyone listening to a single source to use a higher bit depth.

> even just a volume change by the listener if it's applied digitally in the 16 bit realm ...

I think that "if" is doing some heavy lifting here.

Maybe it's an uncommon scenario these days, but not too terribly long ago I think it was fairly typical for software audio players to be 16-bit and offer volume controls, and any setting other than 100% would ruin most of the benefit of dither.