by thebouv
2803 days ago
My first thought on reading this was that it seemed almost over-engineered compared to just using Tika + Tesseract. I'm not sure what benefit they get from using machine learning here, other than "decide whether to try to process this file or not". Tika + Tesseract seems able to do the heavy lifting they spent so much of the article talking about.