by ndesaulniers 1712 days ago
The issue with "expensive to calculate" values like the duration of media (for example, variable encodings) is that the encoder tries to help others avoid recomputing these values by saving the result in some metadata. The problem is that consumers then have to "trust" the encoder; this post demonstrates a non-malicious case, but there may be malicious ones too (like the vulnerability in Android's libstagefright years ago).

For example, I wrote an iTunes-in-the-browser web app; I needed to know durations of songs to display them. MP3 doesn't include these in metadata IIRC, so I needed to pre-process them with ffmpeg just to have duration data. I wasn't doing anything with that other than displaying it. But it would have been nice to just have that info in the metadata.
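For anyone wanting to do the same pre-processing step, here is a rough sketch of asking ffprobe (it ships with ffmpeg) for a file's duration. It assumes `ffprobe` is on the PATH; the function names are made up for illustration, and this is not necessarily how the app above did it.

```python
import subprocess

def ffprobe_duration_cmd(path: str) -> list[str]:
    # Ask ffprobe to print only the container-level duration, in seconds.
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]

def parse_duration(output: str):
    # ffprobe prints something like "215.040000", or "N/A" when it cannot tell.
    try:
        return float(output.strip())
    except ValueError:
        return None

def probe_duration(path: str):
    # Runs ffprobe; raises CalledProcessError if ffprobe exits with an error.
    result = subprocess.run(
        ffprobe_duration_cmd(path),
        capture_output=True, text=True, check=True,
    )
    return parse_duration(result.stdout)
```

Calling `probe_duration("song.mp3")` once per file and caching the number is probably enough if the only goal is to display it.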

2 comments

> For example, I wrote an iTunes-in-the-browser web app; I needed to know durations of songs to display them. MP3 doesn't include these in metadata IIRC, so I needed to pre-process them with ffmpeg just to have duration data.

This jogged my memory from (part of) the first thing I ever built in a general purpose programming language, all of probably 20 years ago! I was doing exactly this: using ffmpeg to get duration metadata from MP3s.

My memory was fuzzy so I looked it up, which (surprisingly!) confirmed what I remembered. MP3s may include metadata (ID3) which may include duration (or start/end times).
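To make that concrete: in ID3v2.3/2.4 the duration lives in the optional `TLEN` text frame, as a numeric string in milliseconds. Below is a minimal sketch of pulling it out of the raw tag bytes. The function name is hypothetical, and it deliberately gives up on tags that use unsynchronisation or an extended header, so treat it as an illustration rather than a full parser.

```python
ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}

def syncsafe(b: bytes) -> int:
    # ID3 "syncsafe" integer: 4 bytes, 7 useful bits each.
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]

def id3_tlen_ms(data: bytes):
    """Return the TLEN value (milliseconds) from an ID3v2.3/2.4 tag, or None."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, flags = data[3], data[5]
    # Assumption: skip tags with unsynchronisation (0x80) or an
    # extended header (0x40) instead of handling them.
    if major not in (3, 4) or flags & 0xC0:
        return None
    tag_end = min(10 + syncsafe(data[6:10]), len(data))
    pos = 10
    while pos + 10 <= tag_end:
        frame_id = data[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":  # reached padding
            break
        raw = data[pos + 4:pos + 8]
        # v2.4 frame sizes are syncsafe; v2.3 sizes are plain big-endian.
        size = syncsafe(raw) if major == 4 else int.from_bytes(raw, "big")
        body = data[pos + 10:pos + 10 + size]
        if frame_id == b"TLEN" and size > 1 and len(body) == size:
            # First byte of a text frame names its encoding.
            codec = ENCODINGS.get(body[0], "latin-1")
            text = body[1:].decode(codec, errors="replace").strip("\x00 ")
            if text.isascii() and text.isdigit():
                return int(text)
            return None
        pos += 10 + size
    return None
```

The value is whatever the tagger wrote, so it is only as trustworthy as the tool that produced the file.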

I knew my input source (it was me, my music, my MP3 conversions), so I was able to rely on the metadata directly. IIRC I even processed it on demand in my first naive version, which was “slow” but not nearly as slow as stuff I’d complain about today.

I ran into a similar issue when I tried to generate a podcast RSS feed from a website whose built-in feed didn't go back far enough. I was trying to do HTTP range requests on the MP3 files to save bandwidth and fetch only their metadata. Sure enough, most had no duration, and when the encoder did put one in a custom field, it usually differed from what VLC reported.
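For reference, the range-request trick looks roughly like this: read the 10-byte ID3v2 header first, work out how long the tag is, then fetch just that many bytes. This is a sketch with made-up function names, and it assumes the server honors `Range` (answers 206); a server that ignores it will start sending the whole file instead.

```python
import urllib.request

def syncsafe(b: bytes) -> int:
    # ID3 "syncsafe" integer: 4 bytes, 7 useful bits each.
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]

def range_request(url: str, start: int, end: int) -> urllib.request.Request:
    # Byte ranges are inclusive on both ends.
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})

def id3v2_tag_length(header: bytes) -> int:
    """Total ID3v2 tag length from a file's first 10 bytes, or 0 if no tag."""
    if len(header) < 10 or header[:3] != b"ID3":
        return 0
    footer = 10 if header[5] & 0x10 else 0  # ID3v2.4 footer flag
    return 10 + syncsafe(header[6:10]) + footer

def fetch_id3v2_tag(url: str) -> bytes:
    # First request: just the fixed-size tag header.
    with urllib.request.urlopen(range_request(url, 0, 9)) as resp:
        head = resp.read(10)
    length = id3v2_tag_length(head)
    if length == 0:
        return b""
    # Second request: the whole tag, and nothing after it.
    with urllib.request.urlopen(range_request(url, 0, length - 1)) as resp:
        return resp.read(length)
```

Embedded cover art can make the tag much larger than expected, and older ID3v1 tags sit in the last 128 bytes of the file, which would need a suffix range (`bytes=-128`) instead.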