by brad0
2766 days ago
Evernote is an interesting case. They store every word that MAY be in the scanned document. So their OCR engine will find a lot of legitimate words, but it will also find a lot of words that don't make sense. When you search for a term, it looks at the entire index (both legit words and the garbage) and returns the documents that match. I think it's quite clever. Bear in mind that this was many years ago; I have no idea if it is still the case.
Screenshot: https://s24953.pcdn.co/blog/wp-content/uploads/2018/02/longh...
Since it's aimed not at transcription (the user doesn't know what he's looking for) but at retrieval (the user knows what he's looking for), it can get away with mistakes.
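To make the idea concrete, here is a rough sketch of how such an index could work. This is not Evernote's actual code or data format: the function names, the document ids, and the shape of the OCR output (a list of alternative readings per word region) are all assumptions for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map every candidate word, plausible or garbage, to the docs it may occur in.

    docs is a hypothetical structure: {doc_id: [[candidate, ...], ...]},
    i.e. one list of alternative OCR readings per scanned word region.
    """
    index = defaultdict(set)
    for doc_id, regions in docs.items():
        for candidates in regions:
            for word in candidates:
                # Keep every reading; nothing is filtered out as "wrong".
                index[word.lower()].add(doc_id)
    return index


def search(index, term):
    """Return the ids of documents where the term is a possible reading."""
    return sorted(index.get(term.lower(), set()))


# Made-up OCR output: each inner list holds the guesses for one word region.
docs = {
    "receipt.jpg": [["total", "tota1", "tofal"], ["due", "duc"]],
    "whiteboard.jpg": [["todo", "toda", "total"], ["list", "1ist"]],
}

index = build_index(docs)
print(search(index, "total"))  # ['receipt.jpg', 'whiteboard.jpg']
```

The garbage readings like "tota1" sit in the index too, but nobody ever types them into the search box, so they cost storage rather than accuracy. The price is the occasional false positive, such as "whiteboard.jpg" matching "total" here.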
References:
https://evernote.com/blog/how-evernotes-image-recognition-wo...
https://help.evernote.com/hc/en-us/articles/208314518-How-Ev...
https://evernote.com/blog/evernote-indexing-system/