by wwalexander 1893 days ago
Back in 2004, someone released an unauthorized fan dub/narration of the first Harry Potter movie called Wizard People, Dear Reader that hilariously butchers the entire plot, character names, and motivations. This narration is (loosely) synced to the movie and its scenes, and for a long time I watched it via clips uploaded to YouTube. But when Sorcerer’s Stone got a 4K release, I decided it was time to rip it and create my own canonical copy.

The original audio files of the dub were still available on archive.org at this point. The first problem is that the second audio file is not meant to play directly after the first: this was back in the days of CD players, so halfway through the movie the dub instructs you to begin playing the second CD once the next scene starts. The second problem is that the second file is louder than the first.

Most sources I saw online said to insert a gap of three seconds to account for the delay, and didn’t have a solution for the difference in volume. I wanted to be more precise.

First, I found the exact start time of the scene where the second audio track begins:

    ffprobe hp.mkv
    ...
        Chapter #0:18: start  4428.882000, end 4817.521000
        Metadata:
          title           : Chapter 19
    ...

Then I compared this with the duration of the first audio track:

    ffprobe wiz1.mp3
    ...
      Duration: 01:13:45.33, start: 0.000000, bitrate: 66 kb/s
    ...

The difference between these two timestamps gave the actual delay of 3.582 seconds.
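
The subtraction can be scripted rather than done by hand. The following is a minimal sketch, assuming a POSIX shell with `awk`; the helper names `hms_to_seconds` and `gap_seconds` are made up for illustration and are not part of FFmpeg.

```shell
#!/bin/sh
# Convert an HH:MM:SS.ss duration (as printed by ffprobe) to seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Gap of silence = chapter start (seconds) minus first track duration (seconds).
gap_seconds() {
    awk -v start="$1" -v dur="$2" 'BEGIN { printf "%.3f\n", start - dur }'
}

# Values copied from the ffprobe output above.
dur=$(hms_to_seconds "01:13:45.33")
gap_seconds 4428.882 "$dur"
```

With the rounded figures shown in the ffprobe output above, this sketch prints 3.552 rather than 3.582, so redo the subtraction against your own files instead of copying a number.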

FFmpeg has more advanced features for volume normalization, but I just wanted to remove the potential for eardrum damage when beginning Chapter 19 and otherwise keep things as close to the original as possible. So I compared the maximum audio levels of the two tracks to determine how much to boost the first track’s volume:

    ffmpeg -i wiz1.mp3 -af volumedetect -f null -
    [Parsed_volumedetect_0 @ 000001ecf871c1c0] max_volume: -11.1 dB

    ffmpeg -i wiz2.mp3 -af volumedetect -f null -
    [Parsed_volumedetect_0 @ 000001858b881880] max_volume: -3.6 dB

The difference between the two peaks, -3.6 dB - (-11.1 dB), gave me a volume increase of 7.5 dB for the first track.
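
The same arithmetic can be pulled straight out of the `volumedetect` log lines. This is a rough sketch, assuming a POSIX shell with `awk`; `max_volume` and `gain_db` are hypothetical helper names, and the log lines are fed in with `echo` so the example runs without the audio files.

```shell
#!/bin/sh
# Extract the max_volume figure (in dB) from volumedetect log text on stdin.
# The value is the second-to-last field of the "max_volume: X dB" line.
max_volume() {
    awk '/max_volume:/ { print $(NF - 1); exit }'
}

# Gain needed to lift the quieter track's peak up to the louder track's peak.
gain_db() {
    awk -v quiet="$1" -v loud="$2" 'BEGIN { printf "%.1f\n", loud - quiet }'
}

# Log lines copied from the output above.
v1=$(echo "[Parsed_volumedetect_0 @ 000001ecf871c1c0] max_volume: -11.1 dB" | max_volume)
v2=$(echo "[Parsed_volumedetect_0 @ 000001858b881880] max_volume: -3.6 dB" | max_volume)
gain_db "$v1" "$v2"   # prints 7.5
```

Against real files, FFmpeg writes its log to stderr, so the pipe would look like `ffmpeg -i wiz1.mp3 -af volumedetect -f null - 2>&1 | max_volume`.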

Once I had these numbers, it was time for the one-liner to adjust the first track’s volume, concatenate the two tracks with the gap of silence, and mux them with the video from the movie:

    ffmpeg -i hp.mkv -i wiz1.mp3 -i wiz2.mp3 -filter_complex "[1]volume=7.5dB[wiz1];aevalsrc=0:duration=3.582:sample_rate=22050[gap];[wiz1][gap][2]concat=n=3:v=0:a=1,apad[wiz]" -map 0 -map "[wiz]" -shortest -c:v copy -c:a flac -sample_fmt s16 -f matroska wiz.mkv
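
For readers who want to plug in their own numbers, the filtergraph in that command can be assembled from its three stages: `volume` boosts the first track, `aevalsrc=0` generates the silent gap, and `concat` joins the three audio segments, with `apad` plus `-shortest` stretching the result to the length of the video. Below is a sketch of that idea, assuming a POSIX shell; `build_filter` and `mux_dub` are hypothetical wrappers, and the 22050 default simply mirrors the sample rate in the original command.

```shell
#!/bin/sh
# Build the filtergraph string for a given gain (dB), gap (seconds), and
# optional sample rate (Hz). Labels wiz1, gap, and wiz match the one-liner.
build_filter() {
    gain=$1 gap_s=$2 rate=${3:-22050}
    printf '%s;%s;%s\n' \
        "[1]volume=${gain}dB[wiz1]" \
        "aevalsrc=0:duration=${gap_s}:sample_rate=${rate}[gap]" \
        "[wiz1][gap][2]concat=n=3:v=0:a=1,apad[wiz]"
}

# Hypothetical wrapper: mux_dub MOVIE TRACK1 TRACK2 GAIN_DB GAP_S OUTPUT
# Uses the same output options as the one-liner above.
mux_dub() {
    ffmpeg -i "$1" -i "$2" -i "$3" \
        -filter_complex "$(build_filter "$4" "$5")" \
        -map 0 -map "[wiz]" -shortest \
        -c:v copy -c:a flac -sample_fmt s16 -f matroska "$6"
}

# Print the filtergraph for the values used in this post.
build_filter 7.5 3.582
```

Calling `mux_dub hp.mkv wiz1.mp3 wiz2.mp3 7.5 3.582 wiz.mkv` should then be equivalent to the one-liner.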