Hacker News
by rayshan 986 days ago
Genuine question: for simple needs, why use this or DEVONthink over macOS's built-in features? macOS now does OCR (Live Text), has tagging, and Spotlight search is fast (though it sometimes returns too many results to be useful). I even stopped splitting PDFs into separate documents and organizing them into folders. I just search.
5 comments

Does auto OCR work on iCloud files? For example: I ScanSnap a huge collection of documents into a folder that is on iCloud (synced with my desktop). It works great because it is so simple. However, if I have, say, a PDF document, will the Mac's OCR functionality perform OCR on it while the doc is on iCloud, and will I then be able to search for the text in that doc via Spotlight / Finder? I tested this a few years ago and searching the content of scanned PDFs did not work. I had looked at Paperless but decided to stay on the macOS file system.
Are you talking about iCloud Drive? As far as I can tell, files in there are just normal files, so Live Text works. You can easily put a screenshot / PDF in there and see.
This is designed more for a self-hosted server, so if you want multi-device web access then it's a great solution. I can download a PDF on my Android phone, upload it to my paperless-ngx instance in a couple of clicks, and easily edit the tags as necessary. It's great for travelling, as you're not reliant on having a locally installed application on whichever device you have with you, and of course it would still be available if you lost your main device and only had your phone on you.
Makes sense, but how about Dropbox / iCloud Drive as alternatives? PDFs / images are somewhat small (at least relative to videos). I just stuff all my PDFs in Dropbox. I'm almost completely paperless and I don't seem to accumulate that many scanned docs to fill up even the free tier storage space.
Yeah, it depends on what you want from a document management system. If you've got a bunch of searchable PDFs, then storing them in a cloud service might well be sufficient. Paperless-NGX adds OCR to the mix (probably more useful for scanned paper images) and also tags. When I add a document, it fills in the best guess for correspondent, document type and appropriate tags, which tends to be accurate for common documents (e.g. payslips, statements) and usually only needs me to change them if it's from someone new.

What I find most useful is grouping together holiday documents such as travel insurance, holiday booking details, passports etc. and assigning a suitable tag, so I can easily find the relevant info. You could easily replicate that by copying those documents into a separate folder for easy access, but with Paperless-NGX it does most of the organisation for you and the search is more flexible as you can specify what kind of document you're looking for and who it came from.

I used to be the target audience and really enjoyed having my system just right, sorting and tagging everything, etc. But over the years I realized that I wasn’t really benefiting much, and gave SwiftScan on my iPhone + dumping into an iCloud folder a try. For my needs, this has worked fine. It is rare that I even need to refer to the scans, and the macOS OCR + automatic dates usually let me find the doc quickly. In the worst case I browse thumbnails.
Yeah. I had a DEVONthink-based setup, but after one too many database corruptions I threw in the towel. Now I just OCR-scan everything into a few macOS folders and search using HoudahSpot (Spotlight, I found, was not suitable for fine-grained search). I’m very happy with the setup.
HoudahSpot looks cool! What kind of query do you use it for that Spotlight can't find for you?
Obvious answer: because, contrary to popular belief, not everyone uses macOS.