by bartread 3206 days ago
And I'll third it.

The MP3 issues don't end there, which is something the article touches on obliquely: you can't reuse many of the important constructs you might want to.

Here's my use case. I have a couple of games (https://arcade.ly/games/starcastle, https://arcade.ly/games/asteroids), each of which has three pieces of music: title screen, in game, and game over. If you play the game a couple of times you're going to hear the title screen audio probably once, in game twice or more (because it loops from the beginning after every playthrough), and game over twice. To put it simply: I need to play the same MP3s multiple times each.

To play an MP3 you have to decode it, which is an expensive operation. Firstly, it takes time to decode: enough time that the user will notice the lag even on a fast machine. However, the main problem is memory usage: decoding takes you from a couple of MB of compressed MP3 to potentially hundreds of MB of uncompressed audio. The problem worsens with multiple tracks.

I discovered the memory issues via Chrome Task Manager, when I noticed my page using hundreds of MB of native memory, and traced this usage back to the music. You can often get away with this when running on a desktop browser, but not so much on mobile.

You can mitigate the memory issue to some extent by dropping the sample rate of your uncompressed PCM audio to 22.05 kHz, which halves its uncompressed size. Quality starts to suffer too much for music if you go much below this though. (Note here that I'm talking about the uncompressed sample rate, and NOT the MP3 bitrate. A 44.1 kHz MP3 encoded at 64 kbps and one encoded at 128 kbps will decompress to the same size, although the 64 kbps version will obviously sound worse because more information will have been lost.)
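To put rough numbers on the decoded size, here's a back-of-envelope sketch. It assumes 32-bit float samples (4 bytes each), which is how the Web Audio API exposes decoded audio; the function name and the 3-minute stereo track are made up for illustration, and it ignores whatever extra the browser allocates internally.

```typescript
// Estimate the memory taken by decoded (uncompressed) PCM audio.
// Assumption: 4 bytes per sample (32-bit float). MP3 bitrate does not appear
// anywhere in the formula, only duration, sample rate and channel count.
function decodedSizeBytes(
  durationSeconds: number,
  sampleRateHz: number,
  channels: number,
  bytesPerSample: number = 4
): number {
  return durationSeconds * sampleRateHz * channels * bytesPerSample;
}

const trackSeconds = 180; // hypothetical 3-minute track

const at44k = decodedSizeBytes(trackSeconds, 44100, 2);
const at22k = decodedSizeBytes(trackSeconds, 22050, 2);

console.log((at44k / 1e6).toFixed(1) + " MB at 44.1 kHz"); // 63.5 MB
console.log((at22k / 1e6).toFixed(1) + " MB at 22.05 kHz"); // 31.8 MB
```

Three tracks of that length at 44.1 kHz would come to roughly 190 MB under these assumptions, which is in line with the "hundreds of MB" figure.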

But the inability to reuse a source buffer, which holds compressed audio, is absolutely aggravating, and something I've posted at length about here: https://github.com/WebAudio/web-audio-api/issues/1175. You might want to do this because it means you're only holding on to as much memory as the compressed audio takes up, and (hopefully) the decoded audio will have been freed by the browser's runtime once you're done with it (no guarantees, obviously).
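One possible workaround, sketched below, is to keep only the compressed bytes around and hand the decoder a fresh copy every time. As far as I know `decodeAudioData` detaches the `ArrayBuffer` you pass it, which is why the copy is needed. The names `Decoder` and `makeTrack` are hypothetical; `decode` stands in for something like `data => audioContext.decodeAudioData(data)`, and this is not necessarily what the games above do.

```typescript
// Stand-in for the real decoder, e.g. (data) => audioContext.decodeAudioData(data).
type Decoder<T> = (data: ArrayBuffer) => Promise<T>;

// Holds the compressed bytes and decodes on demand.
function makeTrack<T>(compressed: ArrayBuffer, decode: Decoder<T>) {
  return {
    // slice(0) makes a throwaway copy, so the decoder can consume (detach)
    // it while the original compressed bytes stay usable for next time.
    decodeFresh(): Promise<T> {
      return decode(compressed.slice(0));
    },
  };
}
```

The cost is a temporary duplicate of the compressed data on every decode, which is small next to the decoded audio.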

The downside of this approach is that you can't start a piece of music at a defined instant, which is extremely frustrating when you might want to synchronise it with events happening on screen.

Also, because the audio is re-decoded every time and decoding is asynchronous, I've now introduced a weird bug where it's possible to end up with both title and in game music playing at the same time if the user starts the game before decoding of the title music is complete. It's fixable (although I haven't had time yet), but it's just one more irritation with a poorly designed API.
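The overlap bug is a stale async completion, and one common way to guard against it is a "latest request wins" ticket. The sketch below is only an illustration: `createMusicSwitcher`, `decode` and `play` are hypothetical names, and `play` is assumed to stop whatever was playing before it starts the new track.

```typescript
// Ignores decodes that finish after a newer music request has been made.
function createMusicSwitcher<T>(
  decode: (name: string) => Promise<T>,
  play: (name: string, audio: T) => void
) {
  let latestTicket = 0;

  return async function switchTo(name: string): Promise<void> {
    const ticket = ++latestTicket; // remember which request this is
    const audio = await decode(name);
    if (ticket !== latestTicket) {
      return; // a newer request arrived while decoding; drop this one
    }
    play(name, audio);
  };
}
```

With this in place, requesting the title music and then the in game music before the title decode finishes results in only the in game track being played, whichever decode completes first.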

I'm actually thinking of going back to using the good old HTML5 AUDIO element just for playing music, since it seems a bit more reliable, but I need to do some experimentation to see what the memory impact is. I also had issues with AUDIO misbehaving quite badly in Firefox with multiple sounds playing simultaneously.

Sound effects are less of an issue because they're obviously quite short and therefore don't take an excessive amount of memory even when uncompressed, so I can at least keep buffer sources around for them. Nonetheless the API's excessive complexity shows through even here: why is it such a drama just to play a sound? Why do I need to create and connect a bunch of objects together just to play a single sound at a given volume? Ridiculous. Asinine.
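For anyone who hasn't used the API, this is roughly the object graph needed to play one already-decoded sound at a given volume. The calls (`createBufferSource`, `createGain`, `connect`, `start`) are the real Web Audio ones, but the `...Like` interfaces and `playSound` are my own minimal stand-ins so the sketch type-checks without browser typings; in a browser you'd pass a real `AudioContext` and `AudioBuffer`.

```typescript
// Minimal structural types covering only the Web Audio calls used below.
interface NodeLike {
  connect(destination: unknown): unknown;
}
interface GainLike extends NodeLike {
  gain: { value: number };
}
interface SourceLike<B> extends NodeLike {
  buffer: B | null;
  start(when?: number): void;
}
interface ContextLike<B> {
  destination: unknown;
  createBufferSource(): SourceLike<B>;
  createGain(): GainLike;
}

// Plays one decoded buffer at the given volume (1 = unchanged).
function playSound<B>(ctx: ContextLike<B>, buffer: B, volume: number): void {
  const source = ctx.createBufferSource(); // one-shot: a new node per playback
  const gain = ctx.createGain();           // separate node just for volume

  source.buffer = buffer;
  gain.gain.value = volume;

  source.connect(gain);          // source -> gain
  gain.connect(ctx.destination); // gain -> speakers
  source.start();
}
```

Two nodes, two connections and a start call for what an AUDIO element does with `volume` and `play()`.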