Hacker News
by jchw 2197 days ago
Nitpick: “audiophile-quality sound,” it seems, is becoming the new “military-grade encryption.”

I don’t have many other comments to make other than I am surprised rust-analyzer was only mentioned in passing.

9 comments

As far as I'm concerned "audiophile" has been synonymous with "overpriced placebo" basically forever.

Beyond that, I wish the article had explained a bit better why it chose these "better-than-std" crates. I'm actually using all the std variants in my projects, and I'm curious to know if I'm missing out or if I just happen not to hit their limitations.

> Beyond that, I wish the article had explained a bit better why it chose these "better-than-std" crates.

At least for parking_lot, its README has a long list with its advantages over std: https://github.com/Amanieu/parking_lot/blob/master/README.md

I saw that but I was interested to know if TFA had decided to go with it because it looks better on paper or if it's because they hit a roadblock using the std counterparts and migrated to using those.

That being said since they're drop-in replacements for the most part I suppose I could just try to rebuild my project with this crate and see if I notice a difference performance-wise.

In our case, it wasn't a matter of hitting a specific roadblock as much as it was past experience in performance-sensitive projects, and knowing that if we start with those crates we'll probably be fine, and if we don't, we'll probably switch to them eventually for some reason.

With crossbeam for example, you can hit roadblocks with std since their channels are MPSC, whereas crossbeam supports MPMC channels (and is faster than std in every meaningful measurement last I checked).

That's great to know, thanks!

Reading the description it almost seemed too good to be true but if it's indeed objectively better in basically every situation I should probably give it a try.

Might as well pick another language for your project if the current one has such shit standard libraries and you're "learning it on the job".
What's wrong with learning a new technology on the job?

Rust leaves a lot of improvements to its standard library to the community, so these improvements start off as separate libraries for faster iteration. The most recent example I remember is the hashbrown crate replacing the standard HashMap.

Does that mean the other language's standard library is better? That it has better third party libraries?
You're right, that sounds way too fluffy.

To clarify, we're targeting "transparent" sounding audio, not "FLACs or bust" audio. Right now we send stereo 48kHz 96kb/s Opus (CELT, not SILK) that we found hit the voice transparency sweet-spot compared to the lossless audio source. We had used higher bitrates in the past, and could easily go back to them, but quality plateaued at around 96k in our experimentation.

More than choosing sane transparent-sounding encoding parameters, the biggest difference in fidelity by far was choosing the correct microphones and speakers for accurate reproduction of voices.

Voice does not extend above 22.05 kHz, so using sampling rates above 44.1 kHz is objectively wasteful, unless your codec only works at 48 kHz input or something.

Are you using 48 kHz for a specific reason?

Please read the official Opus FAQ on sampling rates: https://wiki.xiph.org/OpusFAQ#But_won.27t_the_resampler_hurt...
44.1 kHz is essentially deprecated on the hardware level since it's annoying to deal with the extra clock. It's a few cents for an extra crystal, way too expensive ;). 44100 also makes for very poor multipliers/dividers to other clocks since it includes 3²×5²×7² as factors. 48000 is much nicer with 3×5³.
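
For anyone who wants to check the factor claim, here is a minimal Python sketch. `prime_factors` is a hypothetical helper written for this illustration, not something from the Opus FAQ. It only shows the arithmetic: besides powers of 2, 44100 carries 3²×5²×7², while 48000 carries 3×5³.

```python
def prime_factors(n: int) -> dict:
    """Return the prime factorization of n as {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


for rate in (44_100, 48_000):
    print(rate, prime_factors(rate))
```

The factor of 7² in 44100 is what makes clean integer ratios to other common clock rates awkward.
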
The issue with 'military-grade' is that anyone in the military will attest it translates to: Cheapest possible thing that gets the job done.

Audiophile grade at least has roots in high fidelity.

> Audiophile grade at least has roots in high fidelity.

Does it though? Audiophiles generally seem to eschew fidelity in favour of something that sounds subjectively nice, including the psychoacoustic effects of spending a lot of money.

E.g., they seem very fond of "warmth". If you asked me to make something sound "warm", I'd be applying some soft clipping and dampening the top end, not eliminating sources of distortion.

Edit: If you actually wanted high fidelity, you'd use studio headphones / monitors, which are designed to be "unflattering", so you can be confident you'll hear any issues when mixing / mastering. People don't normally listen for pleasure with those, because they become fatiguing after a few hours.

Choosing equipment because you like the sound is a very reasonable thing to do, but it's not the same as pursuing fidelity.

There's all sorts of audiophiles out there. Some hold beliefs rooted in pseudoscience.

And some are all about accuracy and measurements.

For instance, I use Sennheiser HD600[0], which I strongly recommend, attached to Topping DX3 Pro (old model)[1], which I cannot recommend, as the v2 model shipping now is garbage[2], a consequence of a redesign to work around high fault rates. Mine is fine as problem units fail within weeks, and I've had it for years.

[0]: https://reference-audio-analyzer.pro/en/report/hp/sennheiser...

[1]: https://www.audiosciencereview.com/forum/index.php?threads/r...

[2]: https://www.audiosciencereview.com/forum/index.php?threads/m...

Our ears are incredibly sensitive sensors and I think attributing warmth to soft clipping and dampening the top end is not a complete picture.

Also warmth is just a single quality. I have a pair of very accurate “cold” headphones that I prefer for music and a pair of “warm” headphones for electronic music and gaming.

Past the headphones, it is not so much warmth as it is space in the sound for me. My headphone amplifier sounds effortless and that’s the best way I can describe the quality of what I hear.

But those characteristics are based on objective facts of sound reproduction that can be quantified.

The characteristic of warmth is related to amplification of certain harmonics as well as equalization in the signal. This is fairly well understood by now.

The audiophile definition of a "warm" sound signature has nothing to do with distortion, and audiophiles do not "eschew fidelity" for different sound signatures.
> The audiophile definition of a "warm" sound signature

I don't really know what, if anything, that means. But if we're talking about fidelity, surely the ideal would be no sound signature? If a particular "sound signature" makes it sound "warm", surely it's decreasing the fidelity?

Your lack of knowledge of this matter is very evident, and your skepticism and confusion would be very easily cleared up if you made an actual honest exploration into hi-fi audio.
I'm an EE and have made an honest exploration into this topic many times, and yet still have no explanation of "warmth" beyond the addition of distortion resulting in even-ordered harmonics. Which is precisely a decrease in the SNR from input-to-output.

That might sound good! But it's a less-than-perfect reproduction of the source signal.

If there's a better explanation than what I've come across every time I've searched for this, I'm all ears and honestly open to being corrected.

You've never listened to audiophile equipment, have you? If you apply "some soft clipping" it will sound bad, I guarantee you; no audiophile would like it.
> You've never listened to audiophile equipment have you?

You're saying that I ought to judge the merits of audiophile equipment by the subjective measure of whether I like the sound of it. Which is the metric I said audiophiles would favour.

> If you apply "some soft clipping" it will sound bad

Soft clipping often sounds nice, which is why it's very commonly applied to music. You're saying that eg. the sound of a classic Vox amp is bad, which I guess you're free to believe if that's what your ears tell you, but it's certainly not an objective truth.

Because what you are describing is a simplistic picture, describing a whole class of people as stupid simpletons who cannot tell low-THD, low-IMD audio from "soft clipping which sounds nice". If you are referring to vacuum tube amps, soft clipping is only partially the reason why they sound the way they do; in fact, most of the time amps are not clipping and are outputting close to 1% of their total power. The reasons why tube equipment sounds better/different from solid-state amps are a lot more complex than the "common wisdom" of soft clipping.
Similarly "medical grade" = "single use" in many actual medical contexts.
3DES is still military grade.
No it's not. It stopped being approved for usage by NIST a few years ago.
Really? 3DES still appears here, https://csrc.nist.gov/projects/block-cipher-techniques, with DES and Skipjack being called out as deprecated.
That page says that 3DES is prohibited from usage in new applications and is prohibited for encrypting more than 1 GB of data, since 2017.

The attached documents have additional information on implementation and (non) usage, including the deadline to migrate legacy military systems. It's sadly quite cumbersome to go through the tens of PDFs to find the relevant information.

112-bit keys are still allowed precisely because of 3DES.
It can join the ranks of meaningless phrases like "aircraft-grade aluminum", "chef-grade cookware", and "contractor-grade tools".
> Nitpick: “audiophile-quality sound” it seems, is becoming the new “military-grade encryption.”

It's too bad they didn't explain it. I expected they meant allowance for "full bandwidth" audio (possibly including music you can listen to).

Video conferencing systems generally use voice-only codecs compressed to shit, full of artifacts in the voice range and utterly dead outside of it.

To me, "military grade encryption" means following industry standard. "Audiophile quality" means higher quality than you need, care about, or can even tell apart from lower quality.
No, "military grade encryption" means nothing. If it referenced a standard, then that might mean something. I've worked on products for the military that still used single-pass DES encryption. So that was military grade. It might as well have been ROT13.
especially because all VoIP codecs sound like shit. It's intelligible, but the bar isn't high for fidelity.
Whose ears and which military? :D
Yeah, audiophile can be so many things. To me it means 24 bits or more.
More than 21 bits is meaningless. It's all hype beyond 24 bits.
I believe that's the full dynamic range that human hearing can possibly process: from a really tiny signal that a human can actually hear with noise underneath, up to a really loud signal that is basically pain. Most humans don't have that range. Note that the issue is that the quiet signal needs to be above the noise, so whatever your signal is, the noise floor needs to be below the threshold of hearing given that signal (I believe that while for "normal" signals the noise floor needs to be more than 50 to 60 dB down, for very quiet signals the threshold of detection is only 20 dB further down).

The trick is that our hearing systems are logarithmic (we can't hear a quiet sound next to a loud sound--that's what compression relies on), so they map to floating point numbers better (ie. 16-bit floating point is way more than enough).

24 bits is effectively for recording engineers, so they have lots of headroom and don't have to worry about clipping basically at all (6 dB per bit implies about 18 dB of extra headroom, which is a LOT).
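
The "6 dB per bit" rule of thumb is just 20·log10(2) ≈ 6.02 dB, and the 18 dB figure follows from the 3 bits between 21 and 24. A small Python sketch of that arithmetic (the function names are hypothetical, chosen for this illustration):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)


def extra_headroom_db(container_bits: int, needed_bits: int) -> float:
    """Headroom gained by storing a `needed_bits` signal in a wider container."""
    return dynamic_range_db(container_bits) - dynamic_range_db(needed_bits)


print(round(dynamic_range_db(16), 1))       # roughly 96.3 dB
print(round(dynamic_range_db(24), 1))       # roughly 144.5 dB
print(round(extra_headroom_db(24, 21), 1))  # roughly 18.1 dB
```

This ignores dither and noise shaping, which change the perceived noise floor but not the basic per-bit ratio.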

However, when you calculate non-linear audio effects, you want extra bit depth (generally floating point) because cancellation and multiplication in your intermediate results can really move your noise floor up into bits that humans can actually hear.

While I can't argue for 21 specifically, I definitely don't trust everything to use careful dithering and guarantee full quality in 16 bits. So in practice that's 24 bits at forty-something kilohertz.
You might be right, however 16-bit sounds really harsh to my ears, and 24-bits is the only widely used standard, better than 16-bit.
Do you mean “the expression ‘16-bit’ sounds harsh to my ears”, or do you mean that you can hear the difference between 16 and 24 bits per sample?

The effect of bit depth has little to do with how you perceive the sound; what adding more bits does is allow for more dynamic range, i.e. more difference between the loudest possible and the quietest possible sound. More bits bring down the noise floor. This means that, for example, the final part of a fade-out retains more detail at 24 bits than at 16, but this difference is not something that you would be able to observe in normal listening conditions.
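
One rough way to see the "more bits bring down the noise floor" point is to simulate it. The sketch below is an illustration under simplifying assumptions, not a model of any real converter: it rounds a near-full-scale sine to integers with no dither, and the phase increment is an arbitrary constant. `quantization_snr_db` is a hypothetical helper.

```python
import math


def quantization_snr_db(bits: int, n_samples: int = 50_000) -> float:
    """Round a near-full-scale sine to `bits`-bit integers and return the SNR in dB."""
    amplitude = 2 ** (bits - 1) - 1   # largest positive sample value for signed integers
    step = 2 * math.pi * 0.1234567    # arbitrary phase increment (assumption, not a real rate)
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n_samples):
        exact = amplitude * math.sin(step * i)
        error = round(exact) - exact  # rounding error introduced by quantization
        signal_power += exact * exact
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)


for bits in (8, 16, 24):
    print(bits, round(quantization_snr_db(bits), 1))
```

The measured values should land close to the textbook approximation of 6.02 × bits + 1.76 dB, so each extra bit pushes the quantization noise down by about 6 dB relative to full scale.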

If you'd like to learn more about the effects of bit depth, I would recommend “Digital Show & Tell” by Monty Montgomery (xiphmont) at https://www.xiph.org/video/.

Is there any difference between those two expressions? Overall, yes, you are right: 24 bits do sound better. Loss of detail and its replacement with digital (aggressive, non-random, correlated) noise does indeed sound harsh.
> 16-bit sounds really harsh to my ears, and 24-bits is the only widely used standard, better than 16-bit.

That really doesn't make any sense. The bit depth provides for a dynamic range, meaning the difference between the loudest and quietest sounds which can be encoded. 16 bits is enough to go from "mosquito in the room" to "jackhammer right in your ear". Congratulations, 24 bits let you go up to "head in the output nozzle of a rocket taking off" with room to spare, that's… not very useful?

Now what might make sense — aside from plain placebo — is a difference in mastering. For instance lots of SACD comparisons at the time were really comparing differences in mastering, with the SACD converted to regular CDDA turning out way superior to the CD version because the mastering of the CD was so much worse.

The "Loudness Wars" is an especially bad period of horrible mastering, and it went from the mid 90s to the early-mid 2010s (which doesn't mean that regular-CD has gone back to "super awesome", just that you're unlikely to have clipping throughout a piece these days).

What I said actually does make sense. First of all, if you digitally lower the loudness of audio (say, by a factor of 4), you lose precision, and if you later amplify again, you will never get those bits back. This is what is called headroom. So your typical multiply-by-a-floating-point volume control actually kills the dynamic range of the sound. For example, I never run my OS volume control and player's volume knob at 100% (which would preserve the range), because the gain of my amp is simply too high, and even the slightest movement of the amp knob causes a dramatic change in loudness. Therefore, I keep the digital volume controls at 25% (losing 2 bits on the way, but the audio is recorded at 16-bit, so nothing is lost), and then amplify with my amp. Voila: nothing lost in the process.

Secondly, empirically, every time I switch sound cards to 24 bits it sounds better. I have noticeably less fatigue. Of course, someone may want to gaslight me (not deliberately, of course), attempting to force me to think it is a placebo, but I tried with many people, and all of them noted the difference.
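
The "25% volume costs 2 bits" argument can be sketched with plain integer arithmetic. This is a simplified illustration, not how any particular mixer works: bit shifts stand in for the volume multiply, and real volume controls typically use floating point and may dither. `attenuate_then_restore` is a hypothetical helper.

```python
def attenuate_then_restore(sample: int, shift: int) -> int:
    """Cut the level by 2**shift in integer math, then boost it back by the same amount."""
    return (sample >> shift) << shift


samples_16 = [12345, -20001, 7, 32767]

# 16-bit output path: 25% volume is a right shift by 2, so the two lowest bits are discarded.
restored_16 = [attenuate_then_restore(s, 2) for s in samples_16]

# 24-bit output path: the 16-bit samples occupy the top 16 bits (shifted left by 8),
# so the 2-bit cut only eats into the 8 zero padding bits.
restored_24 = [attenuate_then_restore(s << 8, 2) >> 8 for s in samples_16]

print(restored_16)  # low-order detail is gone
print(restored_24)  # identical to samples_16
```

In the 16-bit path the samples come back rounded down to multiples of 4, while in the 24-bit path the original 16-bit values survive intact.
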
So what you're saying "actually does make sense" so long as it's a completely different subject than what one would normally assume in context, without you having mentioned such.

When people talk about 24 bits (and >48kHz) in the context of "audiophilia", it's generally about the data at rest and "HD audio" (aka 24 bit music files and downloads). Not about the bit depth of the processing pipeline for which it's generally acknowledged that yes, >16 bit depth does make sense for the audio processing pipeline (as well as the original recording).

> but I tried with many people, and all of them noted the difference.

Unless this was a double-blind study and the audio levels were exactly the same between runs, this is useless data. Even a 0.1 dB SPL difference between runs is noticeable (people gravitate to louder sounds as better).

> every time I switch sound cards to 24 bits

This may be related to the sound card. I use an external DAC, not a soundcard, as most soundcards that come with computers are not up to par.

Changing 16 bits to 24 bits should not change the audio in a way that is discernible to the human ear.

> That really doesn't make any sense. The bit depth provides for a dynamic range ... 16 bits is enough to go from "mosquito in the room" to "jackhammer right in your ear".

Dynamic range is not the loudest sound / quietest sound ratio (as one would expect), but the loudest sound / noise level ratio. Otherwise you would need to count additional bits to encode the quietest sound with low enough quantization noise.

The threshold of hearing can be as low as -9 dB SPL, so one would want the noise level below that. Therefore, with the 96 dB dynamic range of 16 bits, the loudest representable sound would be, say, 86 dB SPL. But symphonic orchestra music may have peaks way above 100 dB.

I think the bigger issue is likely to be a trash computer mic, a trash preamp/adc, trash dac, trash speakers, trash room. I don't care if at some point you're sampling and sending that signal at 1000-bit or whatever, it's still trash, just very accurately sampled trash.
I disagree, I do not own trash equipment. Every time I install Linux, I switch the PulseAudio settings from 16-bit to 24-bit; the difference is immediate, although subtle. Everyone I know who tried this noted that listening fatigue is a lot lower with the new settings.
In my direct experience, everyone who claims this to me, so far, is unable to distinguish 16 bit and 24 bit recordings in an ABX.

The audiophile world would do well to adopt the concept of double-blind study.
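
For context on what an ABX result means, here is a minimal Python sketch of the usual significance check: the chance of getting at least a given number of trials right by pure guessing. The function name and the 16-trial example are assumptions for illustration, not numbers from this thread.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by coin-flip guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical session: 12 correct identifications out of 16 trials.
print(round(abx_p_value(12, 16), 3))  # about 0.038
```

A listener who really hears a difference should be able to score well above chance over repeated level-matched trials; a score near half of the trials is what guessing looks like.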