by jeroenhd 844 days ago
Some of the best subtitles I've ever seen were on Tom Scott's YouTube channel. They use different colours, indicators for jokes and sarcasm, while also staying relatively close to what's actually been said. They're better than many big-budget movies and TV shows I've seen.

He talked about subtitling at some point, and I was surprised how cheap subtitling services are. I think he went beyond the price he mentioned, but it really made me question why big, profitable YouTube channels aren't spending the small change on at least native-language subtitles that Google can translate, instead of relying on YouTube's terrible automatic captioning algorithm.

That said, Whisper seems to generate quite good subtitles that take short pauses for timing into account, but they're obviously never going to be as good as a human who actually understands the context of what's being said.

1 comments

Whisper can also generate timings at the word level, which you could use to make better-timed subtitles
Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]

[0] https://github.com/linto-ai/whisper-timestamped
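
For anyone wondering what "use word-level timings to make better-timed subtitles" could look like in practice, here is a minimal sketch, not anything taken from Whisper or whisper-timestamped itself. It assumes you already have a flat list of words, each a dict with "text", "start" and "end" (seconds); that shape is my assumption, so adapt the keys to whatever your tool actually emits. The helper names (fmt_ts, words_to_cues, to_srt) and the max_chars / max_gap thresholds are made up for illustration. It starts a new cue whenever the line would get too long or there is a noticeable pause between words, then writes SRT.

```python
from typing import Dict, List


def fmt_ts(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_cues(words: List[Dict], max_chars: int = 42,
                  max_gap: float = 0.6) -> List[Dict]:
    """Group timed words into cues.

    Assumed input shape (hypothetical): {"text": str, "start": float, "end": float}.
    A new cue starts when adding the word would exceed max_chars,
    or when the silence before the word is longer than max_gap seconds.
    """
    groups: List[List[Dict]] = []
    current: List[Dict] = []
    for w in words:
        word = {**w, "text": w["text"].strip()}
        if current:
            line = " ".join(x["text"] for x in current)
            too_long = len(line) + 1 + len(word["text"]) > max_chars
            long_pause = word["start"] - current[-1]["end"] > max_gap
            if too_long or long_pause:
                groups.append(current)
                current = []
        current.append(word)
    if current:
        groups.append(current)
    return [
        {
            "start": g[0]["start"],
            "end": g[-1]["end"],
            "text": " ".join(x["text"] for x in g),
        }
        for g in groups
    ]


def to_srt(cues: List[Dict]) -> str:
    """Render cues as SRT: index, time range, text, blank line."""
    blocks = []
    for i, c in enumerate(cues, start=1):
        blocks.append(
            f"{i}\n{fmt_ts(c['start'])} --> {fmt_ts(c['end'])}\n{c['text']}\n"
        )
    return "\n".join(blocks)
```

You would call to_srt(words_to_cues(words)) and write the result to a .srt file. Since the cue boundaries come straight from the word timestamps, the output is only as good as those timestamps, which is exactly why the accuracy issue mentioned above matters.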