by srajabi 1277 days ago
Great site!

I feel like it's missing some things though:

* Data Entry -- OCR and the like

* Data Retrieval -- Don't search engines still qualify as AI?

* Sorting mail?

* Other factory use cases like removing undesirable tomatoes: https://www.youtube.com/watch?v=aYQ_5c6m8Is

* Many others I'm not thinking of...

3 comments

Is OCR that good yet? From what I've seen, it's good if you have uniform text in a single standard-looking font. When layout or font varies widely, or there's extraneous stuff on the page (for instance: headers, footers, page numbers, marginal notes, same-page footnotes, low-quality source material with marks on it, or scan artifacts), quality degrades. Good OCR engines I've seen can still OCR all the things and present them in a somewhat readable text format, but general intelligence (human or AGI) allows quick, automatic recognition of different sections of text that a narrow OCR AI struggles with. A human or AGI knows this text and that text are both blockquotes, or marginal notes, and instinctively attaches semantic meaning to each area, font, style, and color encountered. An OCR engine struggles to get beyond blocks of text, each with its own margins and no semantic meaning attached, leading to markup hell.

To highlight the limitations, look at an OCR'd version of a technical book with code samples, fonts and styles that carry different meanings, and both footnotes and endnotes. The text will be readable but disorganized, probably with inconsistent styling, and even if a good engine links some footnotes and endnotes, I suspect that's less than fully reliable. For the purposes of reading the book, I'd rather have the scanned PDF with page images for reading, and the OCR'd text as the text layer for searching.

Lower-quality source images seem to cause major problems for Tesseract, and even for ABBYY, judging from archive.org text conversions. Those engines confuse the more ambiguous letter or punctuation combinations, while humans can still read the images without much trouble.
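If you want to put a number on that kind of degradation, one common approach is character error rate: the edit distance between a hand-checked transcription and the OCR output, divided by the length of the transcription. A minimal sketch follows; the function names and the sample strings are made up for illustration (they are not real Tesseract or ABBYY output), though "rn" read as "m" and "l" read as "1" are the sort of ambiguous combinations I mean.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute (free if the chars match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edits needed to turn the OCR output into the reference, per reference char."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, ocr_output) / len(reference)


# Hypothetical example: invented strings, not output from a real engine.
reference = "modern kernel"
ocr_output = "modem kerne1"
print(f"CER: {character_error_rate(reference, ocr_output):.3f}")
```

Note this only measures character-level mistakes. It says nothing about the layout and semantic problems above, which would need a different kind of check.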

Thanks! It's definitely a short list at this point. I had a few more items in mind but was too eager to share it. I will add some of your suggestions in the coming days.

I feel that AI can do away with 90% of the KPO (knowledge process outsourcing) industry. Most of that is just grunt work being done by a low-paid human halfway across the world.