by tsukikage 2341 days ago
How close to correct do the inputs need to be?

Can it cope with subtitles that are correctly ordered but all have a timestamp of 0, or timestamps one frame apart?

Hi, tsukikage,

Subaligner works best on subtitle segments that already have initial, if inaccurate, timecodes with gaps (non-speech) in between. The scenario you described is equivalent to having a sequence of words with no associated timecodes. Subaligner is not designed to tackle this problem, but it does incorporate forced alignment in its second alignment stage. This feature is experimental and currently English-only, but feel free to give it a go. If you only have a sequence of words as input, there is a nice summary of forced-alignment tools here: https://github.com/pettarin/forced-alignment-tools