by simias 2949 days ago
I think it's a bit like self-driving cars in the sense that it's good enough to be impressive but not good enough to be actually usable everywhere. Of course the stakes are higher for self-driving, because people seldom die of bad captions.

Google's captioning works well when people speak clearly and in English. Google Translate works well when you translate well-written, straightforward text into English. It's impressive, but it's got a long way to go to reach human-grade transcription and translation.

I think when evaluating these things people underestimate how long the tail of these problems is. It's always those pesky diminishing returns. I think it's true for many AI problems today. For instance, it looks like current self-driving car tech manages to handle, say, 95% of situations just fine. Thing is, in order to be actually usable, you want something that critical to reach something like a 99.999% success rate, and bridging these last few percent might prove very difficult, maybe even impossible with current tech.
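
To put rough numbers on that gap, here's a quick back-of-the-envelope sketch. The 95% and 99.999% figures are just the illustrative ones from above, not measured data, and `failures_per_million` is a made-up helper name.

```python
from fractions import Fraction

def failures_per_million(success_rate: Fraction) -> Fraction:
    """Expected failures per one million situations at a given success rate."""
    return (1 - success_rate) * 1_000_000

# Illustrative rates only (assumptions, not measurements).
current = Fraction(95, 100)        # "handles 95% of situations"
target = Fraction(99999, 100000)   # "something like 99.999%"

current_failures = failures_per_million(current)
target_failures = failures_per_million(target)

# How many times rarer failures must become to go from current to target.
improvement = current_failures / target_failures

print(f"current: {current_failures} failures per million")
print(f"target:  {target_failures} failures per million")
print(f"failures must become {improvement}x rarer")
```

So "the last few percent" sounds small, but under these assumed numbers it means making failures thousands of times rarer, which is where the long tail bites.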

1 comment

What's important to remember, I think, is that we should not compare YouTube auto captions to human-made captions, because auto captions were not created as a substitute for human-made captions: if it wasn't for auto captioning, all these videos wouldn't get any captions at all. They may never be perfect, but they're not designed to be; they create new value on their own. And IMO they've crossed the threshold of being usable, at least for English.