by ptx
2146 days ago
Machine translation can never replace real translators unless we develop an AI with actual understanding. Even with human-translated texts, it's usually noticeable when the translator didn't understand the subject. To make sense of the translated text, you then have to reverse-engineer the translator's mapping to figure out what the original would have said. Much like how you can't properly parse HTML using only regular expressions and string substitution, you can't truly translate human languages without understanding. You have to parse the input language, process the meaning of what was said, and finally serialize to the target language.
Making good subtitles means prioritizing readability over accuracy. You have a limited amount of space for your text, and you want to keep the characters-per-second rate low, so you cut words, ruthlessly. But you have to choose which words to cut so that the line still makes sense, which means identifying filler words you can drop or finding ways to rephrase something as a shorter sentence.
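To make the characters-per-second constraint concrete, here is a minimal sketch of how you might check it. The `Cue` class, the function names, and the 17 cps limit are all assumptions for illustration; actual limits vary by style guide, language, and audience.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure for this sketch)."""
    start: float  # seconds
    end: float    # seconds
    text: str


# Assumed reading-speed limit, for illustration only.
MAX_CPS = 17.0


def chars_per_second(cue: Cue) -> float:
    """Characters shown per second of display time."""
    duration = cue.end - cue.start
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    # Line breaks are layout, not something the viewer reads.
    visible = cue.text.replace("\n", "")
    return len(visible) / duration


def too_dense(cue: Cue, max_cps: float = MAX_CPS) -> bool:
    """True if the cue exceeds the assumed reading-speed limit."""
    return chars_per_second(cue) > max_cps


verbose = Cue(10.0, 12.0, "Well, you know, I really think we should probably leave now.")
trimmed = Cue(10.0, 12.0, "I think we should leave now.")

print(chars_per_second(verbose), too_dense(verbose))
print(chars_per_second(trimmed), too_dense(trimmed))
```

Measuring is the easy part. The code can tell you the first cue is too dense, but not that "Well, you know" and "probably" are the words to drop.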
You probably also want to preserve the tone and style of the dialogue, which means you have to choose the right synonyms, not just the most common ones.
And if you're creating hearing-impaired subtitles, it becomes even more necessary to understand what's going on in the video. If someone slams a door center-screen, you can cut that from the subtitles if you have more important things to display, but if someone slams a door off-screen, you absolutely have to include it in the subtitles, because that's the kind of information a hearing-impaired person needs.
Good luck training your little machine-learning network to identify which sound effects originate from objects on-screen and which originate off-screen...