Hacker News
by wahnfrieden 2766 days ago
This is an Evernote feature. Dropbox also launched this feature.
1 comments

Evernote is an interesting case.

They store every word that MAY be in the scanned document.

So their OCR engine will find a lot of legitimate words, but it will also find a lot of words that don't make sense.

When you enter a search term, it looks at the entire index (both the legit words and the garbage) and returns the documents that match.

I think it's quite clever.

Bear in mind that this was many years ago; I have no idea if this is still the case.
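
To make the idea concrete, here is a rough sketch of indexing every word that may be in a scan and searching over all of it. This is not Evernote's actual code; the function names, file names, and sample OCR output are all made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map every candidate word (plausible or garbage) to the documents it may appear in."""
    index = defaultdict(set)
    for doc_id, candidates in docs.items():
        for word in candidates:
            index[word.lower()].add(doc_id)
    return index


def search(index, term):
    """Return the documents whose candidate words include the search term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical OCR output: every word that MAY be in each scan,
# including misreads such as "tota1" and "lnvoice".
docs = {
    "receipt.png": ["total", "tota1", "t0tal", "invoice", "lnvoice"],
    "note.jpg": ["meeting", "rneeting", "total"],
}

index = build_index(docs)
print(search(index, "invoice"))
print(search(index, "total"))
```

The garbage entries cost index space but are harmless in practice, since nobody is likely to type "t0tal" into the search box.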

Yeah, Evernote's OCR engine will generate possible candidates for every given word and will sort them internally by confidence score.

Screenshot: https://s24953.pcdn.co/blog/wp-content/uploads/2018/02/longh...

Since it's aimed not at transcription (the user doesn't know what he's looking for) but at retrieval (the user knows what he's looking for), it can get away with mistakes.
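
A small sketch of why candidates sorted by confidence work for retrieval even when the top guess is wrong. The `Candidate` class, the scores, and the sample words are assumptions for illustration, not Evernote's real data model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    confidence: float  # hypothetical score between 0 and 1


def rank_candidates(candidates):
    """Sort the candidates for one word region, most confident first."""
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)


def search(docs, term):
    """Return (doc_id, score) pairs for documents where any candidate matches the term."""
    hits = []
    for doc_id, regions in docs.items():
        scores = [
            c.confidence
            for region in regions
            for c in region
            if c.text.lower() == term.lower()
        ]
        if scores:
            hits.append((doc_id, max(scores)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Each document is a list of word regions; each region holds several guesses.
docs = {
    "scan_a": [[Candidate("longhand", 0.62), Candidate("longhard", 0.30)]],
    "scan_b": [[Candidate("longband", 0.55), Candidate("longhand", 0.41)]],
}

# A transcription would take the top guess for scan_b and get it wrong.
print(rank_candidates(docs["scan_b"][0])[0].text)

# Retrieval still finds scan_b, because the right word is among the candidates.
print(search(docs, "longhand"))
```

In this sketch the lower-confidence match just ranks lower in the results instead of being thrown away.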

References:

https://evernote.com/blog/how-evernotes-image-recognition-wo...

https://help.evernote.com/hc/en-us/articles/208314518-How-Ev...

https://evernote.com/blog/evernote-indexing-system/

Yep, it's quite clever for searching for things, but much less useful for doing something based on the recognized text.
OneNote can do transcription (copy text from image).