Show HN: A retrainable subtitle synchronizer you can now build your own (subaligner.readthedocs.io)
61 points by icer2020 1962 days ago
10 comments

Can't wait for the Kodi plugin :-)

Unsynced subtitles are hell. I don't know how Kodi knows that a subtitle on OpenSubtitles is "synced". Looks like magic to me.

I use OpenSubtitles with VLC: you just search by hash, and if you get a match it is usually synced.
A hash of what? Torrents? There are often 23.976 fps and 25 fps versions, which are not interchangeable.
A hash of the file you are playing. Usually from torrents.
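For anyone wondering what "a hash of the file" means here: as far as I understand the publicly documented OpenSubtitles scheme, it is the file size plus a 64-bit checksum of the first and last 64 KiB, so two different encodes of the same movie get different hashes. A minimal sketch under that assumption (the function name is mine, and this is not code taken from Kodi or VLC):

```python
import os
import struct

CHUNK = 64 * 1024            # 64 KiB read from each end of the file
MASK = 0xFFFFFFFFFFFFFFFF    # keep the running sum within 64 bits


def opensubtitles_hash(path):
    """Return a 16-digit hex hash: size + sum of the first and last 64 KiB."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        raise ValueError("file is too small for this hashing scheme")

    total = size
    with open(path, "rb") as f:
        for start in (0, size - CHUNK):
            f.seek(start)
            block = f.read(CHUNK)
            # Interpret the block as little-endian unsigned 64-bit integers.
            for (value,) in struct.iter_unpack("<Q", block):
                total = (total + value) & MASK
    return f"{total:016x}"
```

Because only the two ends of the file and its size are read, the hash is cheap to compute even for large videos, which is presumably why players use it for lookups.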
Oh nice to have that on the roadmap.

Subaligner does respect the original frame rate. How the player is going to interpret the subtitle timecode is hard to know though.

The worst are subtitles that start out in sync and then slowly drift.
Is this same-language, or does it work for translated subtitles?

I have some Korean TV DVDs with Japanese subtitles I was trying to use as a guide for adding in US subtitles that came from a different source. I assumed that aside from commercial break timings (one break in the middle) and start/end gaps they should be the same, but my results automating based on timing clusters didn't work too well. Some subtitles were split into multiple segments, sometimes grunts or ambient noise had subtitles, etc.

I might try this out.

The model was trained with features of human voice bound to a frequency range so it may work for "cross-language" sync. Why not give it a go and check the quality? It won't change the content of original segments but only shift them along the timeline if there are gaps.
Just realised another user reported that it did not work well for a Russian movie with Polish subtitles. Nonetheless, that doesn't stop you from training your own subaligner with the media assets you have.
Looks like it's language-agnostic (or at least ALASS is).
Now you can customize and train a new synchronizer using your own subtitles and audiovisual content: https://github.com/baxtree/subaligner
You list aeneas as a dependency here: https://subaligner.readthedocs.io/en/latest/acknowledgement....

That makes me wonder how much of the work is being done by aeneas vs. your own model. If parts of the audio are in a low voice or noisy (which tends to cause aeneas to slip) will subaligner be able to fix that?

Aeneas is used for stretching pre-synced subtitle cue blocks, which is still experimental in subaligner. If the durations of the cues are correct in your case, there is no need to use this feature, and passing in the flag -so will switch it off. So why not get rid of the stretch and see the difference?

As far as I know, Aeneas uses DWT, which does not guarantee the triangle inequality. For low voices or noise, a DNN can handle those better given enough model capacity.

s/DWT/DTW/g
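On the DTW (dynamic time warping) point: here is a small textbook sketch showing that the DTW cost is not a true metric, since it can break the triangle inequality. This is plain DTW over numbers with an absolute-difference cost. It is not aeneas's implementation, and the function name and sample sequences are only for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] is matched again with a later b
                cost[i][j - 1],      # b[j-1] is matched again with a later a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


x, y, z = [0], [1, 2], [2, 3, 3]
print(dtw_distance(x, y))  # 3.0
print(dtw_distance(y, z))  # 3.0
print(dtw_distance(x, z))  # 8.0, which is more than 3.0 + 3.0
```

Going from x to z "via" y costs 6, but the direct cost is 8, so the triangle inequality does not hold for these three sequences.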
Another alternative (written in Rust): https://github.com/kaegi/alass
A useful link. Need to do some comparison against different genres. Subaligner is just yet another tool and not one of a kind.
This worked great for me, thanks for the share!
I tried it with a Russian movie [1] and Polish subtitles. With a single pass it was still off a bit, and dual pass didn't really work super well either. Nevertheless, interesting technology!

1. https://www.imdb.com/title/tt0118767/ (pretty cool movie)

Oh good to know! Never tried that combination before. Maybe this is because the model was pre-trained on English speech. Nonetheless, have you tried switching off the stretch with "-so"?
Wow, this Alass [1] tool worked really well on this though!

1. https://github.com/kaegi/alass

This is magic, how is this done? This is amazing, desynced subtitles are the bane of my existence, if this tool can fix that this day will get even better.

EDIT: Here's how it's done: https://github.com/kaegi/alass/blob/master/documentation/sli...

Does some tool like that exist but only for audio tracks?
The tool uses ffmpeg to load the video, and according to the "anatomy" section it's based on mel-frequency cepstral coefficients, so it's only using the audio to do the alignment.

Feeding it an mp3 might "just work".

Yes, subaligner should work for audio files as it does for video files.
Does anyone know what data the existing model was trained on? And what hardware/drivers it was trained with?

Here is how to retrain the model from your own data:

https://subaligner.readthedocs.io/en/latest/advanced_usage.h...

Maybe it is my bubble speaking, but the only subtitle synchronization I ever needed was fixing either a constant or a continuously increasing/decreasing offset, both easily solved without pretrained deep neural networks.
That's a good bubble to be in. I very often happen upon subtitles that need an offset, a constant multiplier because they run out of sync, and sometimes have a gap somewhere. It's very annoying, and I'm glad there's software to fix them without me having to faff about for ten minutes.
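For the simple cases described above (a constant offset, or a constant multiplier such as a 23.976 vs 25 fps mismatch), a linear remap of the SRT timestamps is usually enough. A rough sketch, assuming standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines. The function names are made up for illustration, and this is not how subaligner or alass work internally.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def fit_linear(sub1_ms, real1_ms, sub2_ms, real2_ms):
    """Solve real = scale * sub + offset from two cues you timed by hand."""
    scale = (real2_ms - real1_ms) / (sub2_ms - sub1_ms)
    offset = real1_ms - scale * sub1_ms
    return scale, offset


def retime_srt(text, scale=1.0, offset_ms=0.0):
    """Apply new = scale * old + offset_ms to every SRT timing line."""

    def shift(match):
        h, m, s, ms = map(int, match.groups())
        old = ((h * 60 + m) * 60 + s) * 1000 + ms
        new = max(0, round(old * scale + offset_ms))  # clamp at zero
        h, rest = divmod(new, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    lines = []
    for line in text.splitlines(keepends=True):
        # Only touch timing lines, so timestamps quoted in dialogue are left alone.
        if "-->" in line:
            line = TIMESTAMP.sub(shift, line)
        lines.append(line)
    return "".join(lines)
```

Usage would be something like `scale, offset = fit_linear(...)` from one early and one late line of dialogue, then `retime_srt(open("movie.srt").read(), scale, offset)`. It cannot handle a gap in the middle (an extra scene or ad break), which is where the alignment tools in this thread earn their keep.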
Here's another solution: https://github.com/readbeyond/aeneas
Much better! This is in C and Python and under AGPL.