Hacker News
by jedbrooke 599 days ago
> Are there other solutions out there?

Yes, you can just press cmd+shift+4 to take a screenshot, then open the screenshot in the popup that appears, and macOS will automatically OCR it (OCR button in the bottom right). This is built-in functionality in macOS.

6 comments

Interestingly the macOS one is not very accurate. I took a screenshot of your comment and macOS OCR read the "cmd+shift+4" as "cod+shift+4".
The thing is, if the linked app is using Apple's Vision API, it will perform the same.
Good point. From the list of supported languages [1] it looks like it is in fact using the Vision API in fast mode (as accurate mode seems to support more languages).

[1] https://www.textcapture.app/#faq

I wonder why that is? Could it mean that Apple trained their OCR tool to favor nontechnical text, so that the tool determined “cod” was more likely than “cmd”?

Interestingly, iOS corrected “cmd” to “cod” when I first typed it out.
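
One way to picture that guess is a toy sketch, and it is not how Apple's OCR is known to work. It assumes the recognizer multiplies a visual match score by a prior for how common each word is in everyday text. Every name and number below is made up for illustration.

```python
# Toy noisy-channel sketch. All scores are hypothetical, not measured.

# Hypothetical visual scores: how well each candidate matches the pixels.
VISUAL_SCORE = {"cmd": 0.55, "cod": 0.45}

# Hypothetical language prior: how common each string is in everyday text.
WORD_PRIOR = {"cmd": 0.001, "cod": 0.02}


def pick_word(visual, prior, use_language_prior=True):
    """Return the candidate with the highest (optionally prior-weighted) score."""
    def score(word):
        if use_language_prior:
            return visual[word] * prior[word]
        return visual[word]

    return max(visual, key=score)


# With the prior, the common word wins even though it matches the pixels less well.
print(pick_word(VISUAL_SCORE, WORD_PRIOR))                            # cod
# Without the prior, the better visual match wins.
print(pick_word(VISUAL_SCORE, WORD_PRIOR, use_language_prior=False))  # cmd
```

Vision's text recognition request does expose a `usesLanguageCorrection` option, which would fit a correction step of this kind, though whether it explains this particular misread is only a guess.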

It correctly OCR'd it for me.
I disabled that function because it gives the illusion that docs and images can be saved with text and will then be indexable and searchable in the Finder and other apps; they are not. When I open a PDF, I need to know that it has native text actually saved in the file. If it doesn't, then I will OCR it so that it is definitely indexable and searchable.
I have been using this one for quite a while, and it works well for me:

https://github.com/schappim/macOCR

(I'd say my number one use is snagging URLs out of Zoom presentations; it's quicker and easier than a screenshot.)

Agreed. But I do wonder whether this product provides enough of a UX improvement to be worth its current price. In my case, it doesn’t support the languages I use, so I’ll be sticking with the default Mac feature.
I've been doing this for a while and find that the OCR performance is fantastic.
Works for images in Preview and even in Safari too. Super handy.
You can even search for text in images in Safari. I was dumbfounded the first time I searched for some text in a page and Safari found it in an image on the page.
Works in Photos.app for searching for text in your photo albums too.

Most of this macOS OCR behavior carries over to iOS too.

Which makes Photos.app a surprisingly good recipe book.
Also as a rolodex. I just take pictures of business cards, then long-press to OCR the phone number and dial it immediately, with no need to create a contact entry unless it becomes a repeat relationship. If it does, you can usually create the full contact card instantly with just a long press on the image.
For me, the moment I realized this was now a table-stakes feature for a GUI OS was when I’d been reading and copy-pasting from an image for a couple of minutes before realizing it wasn’t a PDF.