Hacker News
by michaelt 2338 days ago
Given the performance of Google's auto-captioning, I suspect developing worthwhile auto-captioning is pretty difficult; according to [1], the better YouTube channels use gig-economy captioning at $1 per minute.

Of course, there would be some scope for efficiency - no need to pay for captioning when the performers aren't speaking!
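
As a rough sketch of that saving: the $1 per minute rate is the figure from [1], but the speech segments below are made-up numbers, and billing only for speech time is an assumption rather than how any captioning service is known to charge.

```python
from typing import List, Tuple

RATE_PER_MINUTE = 1.00  # USD per minute, the figure quoted in [1]


def billable_minutes(speech_segments: List[Tuple[float, float]]) -> float:
    """Total minutes of speech, given (start, end) offsets in seconds."""
    return sum(end - start for start, end in speech_segments) / 60.0


def captioning_cost(minutes: float, rate: float = RATE_PER_MINUTE) -> float:
    """Cost in USD for the given number of billable minutes."""
    return round(minutes * rate, 2)


# Hypothetical 10-minute video in which people talk for only part of it.
video_minutes = 10.0
segments = [(0.0, 120.0), (200.0, 380.0), (540.0, 600.0)]  # seconds

full_cost = captioning_cost(video_minutes)
speech_cost = captioning_cost(billable_minutes(segments))
print(f"whole video: ${full_cost:.2f}, speech only: ${speech_cost:.2f}")
```

Detecting where the speech actually starts and stops is the hard part, and it is left out here.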

[1] https://www.wired.com/story/problem-with-youtubes-terrible-c...


YouTube auto-captioning does perform better than Google Translate, though. YouTube's translations capture much more language-specific nuance. Many European languages have a more formal word for the English "you": in German it is "Sie", in French "vous".

YouTube interprets these nuances correctly, whereas Google Translate defaults to the most formal option for "you".

So to me $1 per minute sounds like an awesome deal, because there is not much to adjust.

Out of curiosity, can you link to such a gig company?

YouTube auto-captioning doesn't work for British accents, especially anything Northern English, Scottish or Welsh.

The same goes for any voice control: it doesn't work with my West Country accent.

I don’t know if it’s a gig company necessarily. As I understand it, they don’t use Amazon Turk anymore, but I use CastingWords and have been very happy with them. I think they’re English-only, though.

> Out of curiosity, can you link to such a gig company?

The one mentioned in the article is https://www.rev.com/

Presumably one option, if they want to self-host, is to run the audio through an ML service and post the link. That doesn’t really work as well for user-uploaded video, but it would be pretty easy to build into a content upload pipeline at low cost.
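
A minimal sketch of what that upload step might look like. The `transcribe` callable is a hypothetical stand-in for whichever ML service gets used, not any real service's API, and `fake_transcribe` exists only so the example runs on its own.

```python
import tempfile
from pathlib import Path
from typing import Callable


def caption_on_upload(audio_path: Path, transcribe: Callable[[Path], str]) -> Path:
    """Run one uploaded file through the transcriber and save the text next to it."""
    transcript = transcribe(audio_path)
    out_path = audio_path.with_suffix(".txt")
    out_path.write_text(transcript, encoding="utf-8")
    return out_path


def fake_transcribe(path: Path) -> str:
    # Placeholder: a real pipeline would call the ML service here.
    return f"[machine transcript of {path.name}]"


with tempfile.TemporaryDirectory() as tmp:
    upload = Path(tmp) / "episode.mp3"
    upload.write_bytes(b"")  # empty stand-in for the uploaded audio
    transcript_path = caption_on_upload(upload, fake_transcribe)
    transcript_name = transcript_path.name
    transcript_text = transcript_path.read_text(encoding="utf-8")
    print(transcript_name, "->", transcript_text)
```

Passing the transcriber in as an argument keeps the pipeline step independent of the service, so the service can be swapped without touching the upload code.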

As others say, it’s not perfect, but I have to believe it would be good enough, and it probably isn’t a bad idea in any case.

That’s what I do for podcasts. Machine transcription is getting pretty good but, for publishing an interview transcript, I’d still spend a lot of my time cleaning it up. With a human transcript it’s maybe 15 minutes’ work.