by collin08 1774 days ago
I've used this library in the past to prototype a Chrome extension that extracts Chinese subtitles from YouTube videos. It worked pretty well. The only problem was that the library couldn't really handle realtime video. I can't really fault it for that, though, since I was sending it every frame. The throughput was good, but latency kept increasing, probably because I was giving it too much data.
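
For anyone hitting the same latency creep: one common fix is to stop queueing every frame and keep only the newest one, so the recognizer always works on fresh input and stale frames are dropped. This is not what the prototype did, just a minimal sketch of the idea in Python (the extension itself was JavaScript). `LatestFrameSlot` and `simulate` are made-up names, and the "OCR takes N frame intervals" timing is an assumption for illustration.

```python
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class LatestFrameSlot(Generic[T]):
    """Holds at most one pending frame; a newer frame overwrites an older one."""

    def __init__(self) -> None:
        self._frame: Optional[T] = None
        self.dropped = 0  # frames overwritten before the recognizer took them

    def put(self, frame: T) -> None:
        if self._frame is not None:
            self.dropped += 1
        self._frame = frame

    def take(self) -> Optional[T]:
        frame, self._frame = self._frame, None
        return frame


def simulate(num_frames: int, frames_per_ocr: int) -> List[int]:
    """Simulate one new frame per tick while OCR needs `frames_per_ocr` ticks.

    Returns the indices of the frames that actually get recognized.
    """
    slot: LatestFrameSlot[int] = LatestFrameSlot()
    processed: List[int] = []
    busy_until = 0  # tick at which the (hypothetical) recognizer is free again
    for tick in range(num_frames):
        slot.put(tick)
        if tick >= busy_until:
            frame = slot.take()
            if frame is not None:
                processed.append(frame)
                busy_until = tick + frames_per_ocr
    return processed


if __name__ == "__main__":
    # With OCR three times slower than the frame rate, two of every three
    # frames are skipped, but the result never lags further behind the video.
    print(simulate(10, 3))
```

The tradeoff is that some frames are never recognized, which is usually fine for subtitles since the same text stays on screen for many frames.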

There's a mode where you can increase the number of worker threads. Tesseract is also designed for text documents, and the preprocessing filter I made to make the images look more like a text document was pretty naive.
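
A less naive version of that kind of filter usually means grayscale conversion followed by a global threshold such as Otsu's method, which picks the cutoff that best separates dark and bright pixels. The sketch below is not the filter from the prototype, only a dependency-free illustration in Python. The function names are made up, and it assumes subtitles are brighter than the video behind them, so it inverts the result to get dark text on a white background.

```python
from typing import List, Sequence, Tuple


def to_gray(rgb: Sequence[Tuple[int, int, int]]) -> List[int]:
    """Convert RGB pixels (0-255 per channel) to gray with Rec. 601 luma weights."""
    return [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in rgb]


def otsu_threshold(gray: Sequence[int]) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for value in gray:
        hist[value] += 1

    total = len(gray)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels at or below t yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is at or below t; nothing left to split
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize_for_ocr(gray: Sequence[int], bright_text: bool = True) -> List[int]:
    """Map pixels to 0 (black) or 255 (white).

    With bright_text=True (an assumption about subtitles), bright pixels
    become black so the output resembles dark text on a white page.
    """
    t = otsu_threshold(gray)
    if bright_text:
        return [0 if value > t else 255 for value in gray]
    return [255 if value > t else 0 for value in gray]
```

For example, `binarize_for_ocr(to_gray(pixels))` on a flat list of `(r, g, b)` tuples gives a flat list of 0/255 values that can be written back into an image before it goes to the recognizer. A global threshold will still struggle when the background is as bright as the text, which is where adaptive thresholding or cropping to the subtitle region would come in.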

I'm taking an online computer vision class next semester and hope to pick the project back up after learning a bit more.