Hacker News
by klft 2336 days ago
(1) For note taking I stumbled across anno[1] via[2] two weeks ago. It's a Python Flask application which you run on localhost. You write Markdown, which gets stored locally as a file and is rendered as HTML using pandoc[3]. It's really basic but I love it.

(2) For physical documents I use a Fujitsu ScanSnap iX500[4] for scanning. A runtime license of ABBYY FineReader for OCR is included. The resulting PDF has embedded text which I extract using pdftotext[5]. I wrote a Python application to search and tag these documents. It loads all the text in memory, which is perfectly fine as I have < 10,000 documents. I have used it for 5 years and it works OK.
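For anyone wondering what such an in-memory search and tag tool might look like, here is a minimal sketch. It is not the application described above; the names `Document`, `Archive`, `extract_text` and `load_folder` are made up for illustration, and it assumes `pdftotext` is on the PATH.

```python
import subprocess
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """One scanned document: where it lives, its OCR text, and its tags."""
    path: str
    text: str
    tags: set = field(default_factory=set)


def extract_text(pdf_path):
    """Return the text embedded in a PDF by calling pdftotext.

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


class Archive:
    """Hypothetical archive that keeps every document's text in memory."""

    def __init__(self):
        self.docs = []

    def add(self, path, text, tags=()):
        doc = Document(str(path), text, set(tags))
        self.docs.append(doc)
        return doc

    def load_folder(self, folder):
        # Extract the text of every PDF in the folder (non-recursive).
        for pdf in sorted(Path(folder).glob("*.pdf")):
            self.add(pdf, extract_text(pdf))

    def search(self, query, tag=None):
        """Return documents containing every query word, optionally filtered by tag."""
        words = query.lower().split()
        hits = []
        for doc in self.docs:
            if tag is not None and tag not in doc.tags:
                continue
            text = doc.text.lower()
            if all(word in text for word in words):
                hits.append(doc)
        return hits
```

A plain linear scan like this is usually fast enough for a few thousand documents, which is why no index is built here.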

[1] https://github.com/gwgundersen/anno

[2] https://news.ycombinator.com/item?id=22033792

[3] https://pandoc.org/

[4] https://www.fujitsu.com/global/products/computing/peripheral...

[5] https://en.wikipedia.org/wiki/Pdftotext

4 comments

I have a ScanSnap scanner too (mine's an S1500; I have had it for about 10 years and it still works perfectly) and it's great to be able to search what used to be paper documents quickly and easily. It saves a lot of physical space as well: I shred most documents immediately after scanning, once I've verified the scan is good and backed up.

There are some reasonably good OCR tools on Linux now as well - I've been pretty happy with Tesseract[0]. It was an absolute pain to script everything to "just work" when I press the button on my scanner though.

Recoll[1] works very well for indexing documents for me, including my OCR'd scans. When that's not enough, I fall back to pdfgrep.

0. https://github.com/tesseract-ocr/tesseract

1. https://www.lesbonscomptes.com/recoll/
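The scripting step mentioned above could look roughly like the sketch below, which turns freshly scanned images into searchable PDFs with Tesseract. It is an assumption-laden illustration, not the setup described in the comment: the folder layout, the `.tif` extension, and the function names `tesseract_command` and `ocr_new_scans` are hypothetical, and hooking it up to the scanner button is left out entirely.

```python
import subprocess
from pathlib import Path


def tesseract_command(image_path, lang="eng"):
    """Build the tesseract call that writes a searchable PDF next to the image.

    Tesseract takes an output *base* name and appends ".pdf" itself when the
    "pdf" config is given.
    """
    image = Path(image_path)
    output_base = image.with_suffix("")
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_new_scans(folder, lang="eng"):
    """OCR every .tif in the folder that does not have a matching PDF yet."""
    made = []
    for image in sorted(Path(folder).glob("*.tif")):
        target = image.with_suffix(".pdf")
        if target.exists():
            continue  # already processed on an earlier run
        subprocess.run(tesseract_command(image, lang), check=True)
        made.append(target)
    return made
```

Calling `ocr_new_scans` on the scan folder after each scan only processes the new images, so it is safe to run repeatedly.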

Actually, what has been bugging me recently is the inability to "tag" photos on my iPhone. All I want is to snap a copy of my bill or invoice, tag it with "gas bill", and let it upload to iCloud or Dropbox. From there I am sure I can process it further by looking for "gas bill", but there seems to be no obvious way to do it (I even looked into EXIF data), so I guess it will have to wait until I learn iOS coding.
Touch and hold , then tap an option. Custom: Tap , tap Enter New Tag, type a custom tag, and tap Done. Create additional custom tags: Tap , tap Enter New Tag, type a custom tag, and tap Done. Add more than one custom tag to a photo: Tap , and tap each tag you want to add (so a checkmark appears next to it).
Is this a real UX, or something you'd like? (This isn't how either Apple or Google Photos works)
Have you looked into apps like Scanbot [1]?

1: https://scanbot.io/en/index.html

Totally unrelated but I love these "how I built my version of" threads - I learn about tech and projects I never knew existed

ok carry on please, diversion over :/)

I've been looking for a good multi-document feed scanner. Do you have experience using the iX500 with Linux, or gscan2pdf?

My use case would be scanning multi-page documents with minimal effort, and saving to PDF somewhere.

I thought about Linux, but while it should be possible to use the iX500 with Linux, you would lose the bundled OCR. I did some tests and compared the OCR of the included ABBYY FineReader with Tesseract[1]. Tesseract was not good enough for my use case, so I still use the iX500 on Windows.

[1] https://en.wikipedia.org/wiki/Tesseract_(software)
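If you want to repeat that kind of comparison on your own scans, one simple approach is to type out a page by hand and measure how many of its words each engine recovers. The sketch below is just one way to do it, not the test described above; `word_accuracy` is a made-up helper and the sample strings are invented.

```python
from difflib import SequenceMatcher


def word_accuracy(reference, ocr_output):
    """Fraction of reference words that the OCR output reproduces in order.

    Both texts are lowercased and split on whitespace; matching is done on
    whole words, so a single wrong character counts as a missed word.
    """
    ref_words = reference.lower().split()
    ocr_words = ocr_output.lower().split()
    if not ref_words:
        return 1.0 if not ocr_words else 0.0
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


# Invented example: a hand-typed line and two imaginary OCR results.
reference = "Total amount due 42 euro"
engine_a = "Total amount due 42 euro"
engine_b = "Total arnount due 42 euro"
print(word_accuracy(reference, engine_a))
print(word_accuracy(reference, engine_b))
```

Running this over a handful of representative pages per engine gives a rough number to compare, although layout and table handling would need a separate check.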