by kojackst 2673 days ago
Sorry for the dumb question, but what exactly does this do?

`subsync reference.srt -i unsynchronized.srt -o synchronized.srt`

I mean, I already have a synchronized srt file, so what would I be syncing here?


Great question. The main use case I can think of is when reference.srt and unsynchronized.srt are in different languages, and you want to eventually merge them into a single dual-language subtitle file.

EDIT: Oh, I should also mention that you don't need a reference.srt -- it can look at the video directly and use that as a reference.

Ohh, now I get you. I believe the README should be improved in this part, then.

It reads:

====

Although it can usually work if all you have is the video file, it will be faster (and potentially more accurate) if you have a correctly synchronized "reference" srt file, in which case you can do the following:

subsync reference.srt -i unsynchronized.srt -o synchronized.srt

====

I believe you should explain that if you have a reference file in another language which is correctly synchronized with that video, you can use that file instead of the video, as its timestamps will serve as references when synchronizing the target .srt file.

Now this raises a question: what if the reference file has a different block count? For example, in some languages (like Chinese or Japanese) we can say a lot with fewer characters than in English. So in Chinese a text will stay on the screen for a long time, whereas in English the corresponding text would be split into two or more blocks. Wouldn't that make synchronization less accurate?

BTW that's a cool project. Thanks for sharing!

Really appreciate the feedback -- will definitely work on improving the clarity in the README. Glad you like the project!
Sorry, I forgot to answer your last question. It turns out that, because of how the algorithm works, the number of blocks shouldn't matter. Since it is discretizing time windows in 10ms increments, the granularity of the "effective blocks" is small enough that putting two separate large blocks on the screen, each for a shorter period of time, is roughly equivalent to putting a single large block on the screen for twice as long (for synchronization purposes, that is).
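
To make the 10ms discretization point concrete, here is a minimal sketch in Python. It is not the project's actual code: the function name `to_mask` and the millisecond block format are assumptions made for illustration. It turns subtitle blocks into an on/off value per 10ms window and shows that two back-to-back blocks produce the same signal as one long block.

```python
FRAME_MS = 10  # window size mentioned above

def to_mask(blocks, total_ms):
    """Turn (start_ms, end_ms) subtitle blocks into a 0/1 list,
    one entry per 10 ms window (hypothetical helper, for illustration)."""
    mask = [0] * (total_ms // FRAME_MS)
    for start_ms, end_ms in blocks:
        first = start_ms // FRAME_MS
        last = min(end_ms // FRAME_MS, len(mask))
        for i in range(first, last):
            mask[i] = 1
    return mask

# One long block vs. the same span split into two blocks.
one_long = to_mask([(1000, 5000)], total_ms=10_000)
two_short = to_mask([(1000, 3000), (3000, 5000)], total_ms=10_000)

# Same split, but with a 20 ms gap between the two blocks.
with_gap = to_mask([(1000, 2980), (3000, 5000)], total_ms=10_000)

print(one_long == two_short)                              # identical signals
print(sum(a != b for a, b in zip(one_long, with_gap)))    # windows that differ
```

Even with a small gap between the two blocks, only a couple of windows out of hundreds differ, which is why the block count has little effect on the result.
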
> when reference.srt and unsynchronized.srt are in different languages

I just want to thank you for including this use-case, because it's exactly the thing I'm regularly running into. Subs in one language are bundled with the vid, all subs from OpenSubtitles are desynchronized.

You are most welcome! This is exactly the use case I was initially targeting; I got super lucky that the VAD-based synchronization happened to be low-hanging fruit that has a higher "Wow" factor.
Given it just processes audio, how long does processing one video file usually take?
Audio extraction is actually the most expensive part. It depends on the length of the video, but my experience is that it finishes in 20 to 30 seconds. It's possible one might be able to sample different parts of the video to bring the runtime down further; I plan to experiment with this when I get the chance.
It's quite common for different people to create their own subtitles for videos, and for all of these different subtitle versions to be available for download at various subtitle sites on the internet.

If you've already synchronized one of these from the video itself (by using the voice detection algorithm described in the Readme, or maybe even by hand), it looks like you can then synchronize the rest using the already synchronized subtitle file. It's probably faster.

I haven’t tried this command, but I believe the use case is as follows:

• Open audio/video file with —supposedly— synchronized file

• After a few seconds, I realize the subtitles appear before/after the dialogues

• I immediately close the multimedia player, and open the Terminal

• I execute the “subsync” command which does who knows what

• Open the SRT and discover that the subtitles are now in the correct timestamps

• ???

• Profit

Pretty much! Given the anecdotal success I've personally had with this approach, I'm hoping it could get picked up by VLC so that the algo can be run in there directly.
Previously: Fiddle with the subtitle/audio offset factor (and then they drift apart again slowly, driving you mad!)

Now: subsync

Soon: Players run subsync internally at the press of one button or a command-line switch.

The voice audio detection and then mapping is such a neat solution. I would have embedded parts of the surrounding audio in some base64 format into the subtitle file and then used that as an alignment clue. But this won't work when the languages don't match.

Actually, it unfortunately doesn't work if the drift gets worse over time -- so far, it only works with constant drift (that is, a fixed offset). Maybe fixing constant 1st derivative drift is the next step!
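
As an illustration of what fixing a constant offset can look like, here is a brute-force sketch in Python. It is an assumption about the general idea, not the project's implementation: `best_offset` is a hypothetical name, the inputs are 0/1 lists with one entry per time window, and a real tool would likely use something faster than trying every shift.

```python
def best_offset(reference, target, max_shift):
    """Return the shift (in windows) that best lines `target` up with `reference`.

    Both inputs are 0/1 lists, one entry per time window. A positive result
    means the target's subtitles should be moved later.
    """
    def score(shift):
        # Count windows where both signals are "on" after shifting the target.
        total = 0
        for i, t in enumerate(target):
            j = i + shift
            if t and 0 <= j < len(reference) and reference[j]:
                total += 1
        return total

    return max(range(-max_shift, max_shift + 1), key=score)

# Toy reference signal: speech in two stretches.
reference = [0] * 50 + [1] * 30 + [0] * 20 + [1] * 10 + [0] * 90
# Toy target: the same signal, but everything happens 7 windows too early.
target = reference[7:] + [0] * 7

shift = best_offset(reference, target, max_shift=20)
print(shift)  # 7 windows, i.e. 70 ms at 10 ms per window
```

A single shift like this cannot repair drift that grows over time, since no one offset lines up both the start and the end of the file.
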
The drift comes from the video playing at a different speed than the video the subtitles were made for.

E.g. suppose a subtitle file was made for 24 frames per second (classic film speed) and you have the video presented at 25 frames per second (common in Europe). The original two-hour video at 24 fps is then about 5 minutes shorter in the European version. Or the opposite: subtitles timed for the shorter version would, by the end of the 120-minute video, appear about 5 minutes too early!
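
The arithmetic behind that figure, as a small Python sketch (the function name is made up for illustration): the same frames played at a higher frame rate take proportionally less time.

```python
def playback_minutes(original_minutes, original_fps, playback_fps):
    """Running time when the same frames are played at a different frame rate."""
    return original_minutes * original_fps / playback_fps

sped_up = playback_minutes(120, original_fps=24, playback_fps=25)
drift_at_end = 120 - sped_up

print(round(sped_up, 1))       # about 115.2 minutes
print(round(drift_at_end, 1))  # about 4.8 minutes, i.e. roughly 5
```

Because the gap grows in proportion to elapsed time, it is close to zero at the start and reaches its maximum only at the end.
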

Apparently there are some other speed changes too, though I don't know how they arise.

I did one such correction once, using a linear function to model the correction, based on the target times of the first and the last subtitle.
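
A linear correction of that kind might look like the following Python sketch. The names are hypothetical, and it assumes you know where the first and last subtitles should land in the target video; every other timestamp is then mapped along the straight line through those two points.

```python
def make_linear_fix(old_first, old_last, new_first, new_last):
    """Build a function mapping old subtitle times (seconds) to corrected times.

    The mapping is the straight line that sends old_first -> new_first
    and old_last -> new_last.
    """
    scale = (new_last - new_first) / (old_last - old_first)

    def fix(t):
        return new_first + (t - old_first) * scale

    return fix

# Example: subtitles timed for a 25 fps copy, video actually runs at 24 fps,
# so every correct time is the old time multiplied by 25/24.
fix = make_linear_fix(old_first=60.0, old_last=6900.0,
                      new_first=62.5, new_last=7187.5)

print(fix(60.0))    # first subtitle lands on its target time
print(fix(3480.0))  # a subtitle in the middle is stretched proportionally
```

This handles both a fixed offset and a steady speed difference, but not subtitles whose offset jumps partway through.
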

> only works with constant drift

I hoped that this solution would sync parts that have really different offsets between the audio and the subs, including changes from negative to positive offsets. Because those are the cases where automatic fixes in e.g. Aegisub don't suffice.

This happens when the vid and the subs are from different releases which apparently were edited for some reason (regional releases or something). Like, after some point the subs are suddenly off by a minute.

Oh! Yeah, that makes absolutely perfect sense! I just asked the same question here regarding the difference in FPS of video and subtitle.
I'm guessing `synchronized.srt` is just the name of the output file.
I think they were referring to the fact that "reference.srt" would already be synchronized.