by mywittyname
336 days ago
I think it's a classification issue. Silence is never written into a film's subtitles, because it doesn't need to be: viewers can tell that nothing is being said when there are actors on screen. When there are no actors on screen, a subtitle usually indicates what is going on, like "[rock music plays]". Subtitle authors use these silent stretches to fit in meta information, and have done so since the closed-caption era. Proper data cleaning would strip this metadata from any subtitle source. Since that wasn't done, this is fundamentally a classification issue. It may also be an over-fitting issue, but that is secondary to the classification problem.
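For what it's worth, the cleaning step could look roughly like the sketch below. It is only an illustration in Python: the two regex patterns and the `clean_cue` name are my own assumptions, not taken from any actual training pipeline, and a real corpus would need a much longer, hand-tuned pattern list.

```python
import re

# Assumed pattern: bracketed or parenthesized non-speech cues,
# e.g. "[rock music plays]" or "(sighs)".
SOUND_CUE = re.compile(r"\[[^\]]*\]|\([^)]*\)")

# Assumed heuristic: lines that look like author credits,
# e.g. "Subtitles by SomeGroup". This can misfire on real dialogue.
CREDIT_LINE = re.compile(
    r"^\s*(subtitles?|captions?|subs|transcri\w+|translat\w+)\b.*\bby\b",
    re.IGNORECASE,
)

def clean_cue(text):
    """Return the spoken part of one subtitle cue, or None if nothing is left."""
    kept = []
    for line in text.splitlines():
        if CREDIT_LINE.search(line):
            continue  # drop credit metadata entirely
        line = SOUND_CUE.sub("", line)   # remove sound/music descriptions
        line = " ".join(line.split())    # collapse leftover whitespace
        if line:
            kept.append(line)
    return " ".join(kept) or None
```

Cues that come back as `None` would be dropped from the training set rather than paired with audio, which is the whole point: metadata never gets the chance to be learned as a transcript of silence.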