by alpe
3382 days ago
Thank you. Indeed, several aeneas users have adopted it for producing SRT/TTML files (i.e. captions) for videos, both online and offline, and many of them start from an existing transcript. However, please note that there are limits on the amount of non-speech audio that aeneas can tolerate: for example, long spurious portions of audio or sung passages might degrade the quality of the alignment. For details on how aeneas works: https://github.com/readbeyond/aeneas/blob/master/wiki/HOWITW...
|
Couldn't you also accept, as part of the input, a very simple map where users define time ranges that should be ignored, to help with that? It might also be possible to look at the spectrum over time to identify areas of the file to skip.
And speaking of the spectrum, just wondering: are you doing any pre-processing, such as EQ (a narrow band-pass on speech frequencies) or compression to even out volume differences, to help with this as well?
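To make the "ignore map" idea concrete, here is a minimal sketch of the bookkeeping it would need, assuming the ignored ranges are cut out before alignment and the resulting timestamps are then shifted back onto the original audio. This is not an aeneas feature or API; the function names (`merge_intervals`, `kept_segments`, `to_original_time`) and the list-of-tuples format are made up for illustration.

```python
def merge_intervals(intervals):
    """Sort (start, end) intervals in seconds and merge any that overlap."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def kept_segments(duration, ignored):
    """Return the (start, end) parts of the audio left after skipping `ignored`."""
    kept, cursor = [], 0.0
    for start, end in merge_intervals(ignored):
        # Clamp user-supplied ranges to the actual audio length.
        start = min(max(0.0, start), duration)
        end = min(max(0.0, end), duration)
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept


def to_original_time(t, kept):
    """Map a time on the trimmed timeline back to the original timeline.

    A time that falls exactly on a cut maps to the end of the earlier segment.
    """
    offset = 0.0  # trimmed-timeline position where the current segment starts
    for start, end in kept:
        length = end - start
        if t <= offset + length:
            return start + (t - offset)
        offset += length
    raise ValueError("time lies beyond the end of the trimmed audio")


# Hypothetical ignore map for a 60 s file: music at 10-25 s and 40-45 s.
ignore_map = [(10, 20), (15, 25), (40, 45)]
kept = kept_segments(60, ignore_map)
# A fragment aligned at 12 s in the trimmed audio sits at 27 s in the original.
original_start = to_original_time(12.0, kept)
```

The aligner would only ever see the concatenated kept segments, so the remapping step is what keeps the final SRT/TTML timings valid for the untouched video.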