HN Mirror



	by wahnfrieden 2766 days ago
	This is an Evernote feature. Dropbox also launched this feature.

1 comments

brad0 2766 days ago

Evernote is an interesting case.

They store every word that MAY be in the scanned document.

So their OCR engine will find a lot of legitimate words, but it will also find a lot of words that don't make sense.

When you enter a search term, it looks at the entire index (both legitimate words and the garbage) and returns the documents that match.
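A minimal sketch of that idea in Python, assuming a plain inverted index over every candidate word. This is not Evernote's actual implementation; the file names, the candidate words, and the `build_index` and `search` helpers are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical OCR output: for each document, every word the engine thinks
# MAY appear, including obvious misreads. The data is invented.
ocr_candidates = {
    "receipt.jpg": ["total", "tota1", "coffee", "coffce", "c0ffee"],
    "whiteboard.jpg": ["roadmap", "roadrnap", "total", "q3"],
}


def build_index(candidates_by_doc):
    """Map each candidate word (legitimate or garbage) to the documents containing it."""
    index = defaultdict(set)
    for doc, words in candidates_by_doc.items():
        for word in words:
            index[word.lower()].add(doc)
    return index


def search(index, term):
    """Return the documents whose candidate list contains the search term."""
    return sorted(index.get(term.lower(), set()))


index = build_index(ocr_candidates)
print(search(index, "coffee"))
print(search(index, "total"))
```

The garbage entries cost some index space but do little harm to search, since nobody is likely to type "coffce" as a query.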

I think it's quite clever.

Bear in mind that this was many years ago; I have no idea if this is still the case.


ocrcustomserver 2766 days ago

Yeah, Evernote's OCR engine will generate possible candidates for every given word and will sort them internally by confidence score.

Screenshot: https://s24953.pcdn.co/blog/wp-content/uploads/2018/02/longh...

Since it's aimed not at transcription (user doesn't know what he's looking for) but at retrieval (user knows what he's looking for), it can get away with mistakes.
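To make the retrieval-vs-transcription difference concrete, here is a rough Python sketch. It assumes each recognized word region carries a list of (candidate, confidence) guesses. The data layout, the scores, and the `search` and `transcribe` helpers are hypothetical and not taken from Evernote's engine.

```python
# Hypothetical recognition result: each word region in an image maps to
# several (candidate, confidence) guesses. All values are invented.
regions_by_doc = {
    "menu.jpg": [
        [("latte", 0.62), ("1atte", 0.21), ("lathe", 0.17)],
        [("price", 0.90), ("prlce", 0.10)],
    ],
    "workshop.jpg": [
        [("lathe", 0.81), ("latte", 0.12)],
    ],
}


def search(regions_by_doc, term):
    """Retrieval: return (doc, score) pairs, best first. The score is the
    highest confidence of any candidate equal to the search term."""
    hits = []
    for doc, regions in regions_by_doc.items():
        scores = [conf for region in regions for cand, conf in region if cand == term]
        if scores:
            hits.append((doc, max(scores)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def transcribe(regions):
    """Naive transcription: commit to the single top candidate per region."""
    return " ".join(max(region, key=lambda c: c[1])[0] for region in regions)


print(search(regions_by_doc, "latte"))
print(transcribe(regions_by_doc["menu.jpg"]))
```

In this sketch, retrieval still finds a document when the right word is only a low-ranked candidate, while transcription has to commit to one guess per region and is wrong whenever the top guess is.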

References:

https://evernote.com/blog/how-evernotes-image-recognition-wo...

https://help.evernote.com/hc/en-us/articles/208314518-How-Ev...

https://evernote.com/blog/evernote-indexing-system/


julianz 2766 days ago

Yep, it's quite clever for searching, but much less useful for doing something based on the recognized text.


ocrcustomserver 2766 days ago

OneNote can do transcription (copy text from image).
