by mikebennett 4787 days ago
While we could use speech-to-text (STT), and some of our team have backgrounds in it, we chose to use the cleanest existing signal, i.e. closed captions.

Part of the motivation for taking a statistical NLP approach is that it gives us more flexibility for processing foreign stations and languages (we don't do that yet).

I wonder whether you could time- and geo-shift closed captions, i.e. show closed captions in two languages at once on the same TV program. That could make an interesting language-learning tool and an interesting training set for machine translation.
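To make the two-language idea concrete, here is a rough sketch of how caption cues from two broadcasts of the same program might be paired by time overlap. Everything in it is an assumption for illustration: the `Cue` type, the `pair_cues` function, the `offset` and `min_overlap` parameters, and the sample captions are all hypothetical, and real caption feeds would need drift correction and handling of cues that split differently across languages.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue (hypothetical type, times in seconds)."""
    start: float
    end: float
    text: str


def pair_cues(primary, secondary, offset=0.0, min_overlap=0.5):
    """Pair each primary cue with the secondary cue that overlaps it most.

    `offset` is added to every secondary timestamp to undo a known time
    shift between the two broadcasts. Pairs overlapping by less than
    `min_overlap` seconds are dropped.
    """
    pairs = []
    for p in primary:
        best, best_overlap = None, 0.0
        for s in secondary:
            # Length of the intersection of the two time intervals
            # (negative when they do not intersect).
            overlap = min(p.end, s.end + offset) - max(p.start, s.start + offset)
            if overlap > best_overlap:
                best, best_overlap = s, overlap
        if best is not None and best_overlap >= min_overlap:
            pairs.append((p.text, best.text))
    return pairs


# Made-up sample data: the second feed airs about 10 seconds later.
english = [Cue(0.0, 2.0, "Good evening."), Cue(2.5, 5.0, "Here is the news.")]
spanish = [Cue(10.2, 12.1, "Buenas noches."), Cue(12.6, 15.0, "Estas son las noticias.")]

pairs = pair_cues(english, spanish, offset=-10.0)
for en, es in pairs:
    print(f"{en}  |  {es}")
```

The resulting sentence pairs are the kind of thing that could be shown side by side for language learners or collected as a rough parallel corpus, though overlap alone is a crude alignment signal.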