by tastyminerals 2332 days ago
It's a nice overview article for anyone interested in the topic of information extraction (IE) from financial documents. However, for industrial-level solutions Tesseract does not cut it. Abbyy is the best OCR engine on the market currently. Receipt IE works just fine with rules supported by a small BiLSTM as a fallback, simply because receipts do not contain a lot of text. With invoices this approach is suboptimal. On a general note no DL approach will give you fast and high enough results just because any advanced network would be too slow and too generic. If extraction takes more than 5 seconds, it is hard to sell such a system.
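To make the "rules with a small model as fallback" idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: the two regex rules, the field names, and `dummy_fallback`, which only stands in for a small tagger such as a BiLSTM and is not a real model.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical rules for two receipt fields; real rule sets would be much richer.
RULES = {
    "total": re.compile(r"\btotal\b[^\d\n]*(\d+[.,]\d{2})", re.IGNORECASE),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{2}[./]\d{2}[./]\d{4})\b"),
}

# A fallback takes (field name, full text) and returns a value or None.
Fallback = Callable[[str, str], Optional[str]]


def extract_fields(text: str, fallback: Optional[Fallback] = None) -> Dict[str, Optional[str]]:
    """Try the cheap rule for each field; call the fallback only when the rule misses."""
    result: Dict[str, Optional[str]] = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            result[field] = match.group(1)
        elif fallback is not None:
            result[field] = fallback(field, text)
        else:
            result[field] = None
    return result


def dummy_fallback(field: str, text: str) -> Optional[str]:
    # Placeholder for a small sequence tagger; it always gives up.
    return None


receipt = "ACME STORE\n2019-08-14\nCoffee 3.50\nTOTAL: 7,00"
fields = extract_fields(receipt, dummy_fallback)
```

The point of the ordering is that the rules handle the common case cheaply, so the slower model only runs on the fields the rules could not find.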
3 comments

> On a general note no DL approach will give you fast and high enough results just because any advanced network would be too slow and too generic.

Yeah, but see, I have this shiny hammer, and if I squint just right, everything looks like a nail!

The seasonal trends in the programmer world get kinda tiring after a while, and the people peddling the latest hot new thing equally so, such as this thinly veiled promo piece for nanonets.

In science, there's this concept of falsifiability. If a theory can't be disproven, it isn't a scientific theory at all. The same goes for technological evangelism. If no-one knows what a piece of tech is bad at, how the hell would you know if it's any good at anything, really? There are no panaceas, no one-tool-to-rule-them-all, no single piece of tech that will usher in a new golden age for programmerkind. They're just tools in a toolbox. Know what each tool is good at. Know what each tool is bad at. Don't forget your old tools, just because the newest tool is still very shiny.

Platitudes, I know, but still.

I had ABBYY a long time ago and had some uses for it, so I went to the website to check out the cost and was met with this monstrosity: "Protect your shopping cart downloads with Download Insurance Service. For only £11.00 you will be able to download your files for 24 months, in case you need to reinstall the products. ABBYY Screenshot Reader". Ripoff! I've never seen anything like it elsewhere. Has anybody else?
It is the market leader and a monopolist; this is what you get.
Wow, Abbyy is 26 years old. Nice to see software that lasts that long.
I remember playing with their OCR (FineReader, I think it was called?) when it came as part of shareware bundles on magazine CDs in the mid-90s...