Hacker News
by mvdwoord 43 days ago
"everyone hates PDFs where you can't reliably select and copy text!"

Boy do I. One of my biggest annoyances is receiving an invoice in PDF format where I either cannot select the text at all or cannot select it cleanly, i.e. when I try to select something it somehow half-highlights the line above as well, so I am not sure what is on my clipboard and need to paste it into a text editor first, then select what I need ... etc.

Super nice when they list IBAN numbers for payment in a tiny font size as well.

Maybe I should vibecode a little helper tool to visually select a rectangle, perform OCR, and detect IBAN numbers, or show a popup with the proper text so I can do my subselect.
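
For what it's worth, the IBAN-detection half of such a helper is small. Below is a rough sketch, assuming the OCR step has already produced plain text. The names `find_ibans` and `is_valid_iban` are made up for illustration. It only applies the generic mod-97 check from the IBAN standard and does not verify country-specific lengths, so an occasional false positive is possible.

```python
import re

# Candidate pattern: country code (2 letters), 2 check digits, then 11-30
# alphanumerics, optionally printed in space-separated groups.
IBAN_CANDIDATE = re.compile(r"\b[A-Z]{2}[0-9]{2}(?: ?[A-Z0-9]){11,30}\b")


def is_valid_iban(iban: str) -> bool:
    """Generic mod-97 check; country-specific lengths are NOT verified."""
    iban = iban.replace(" ", "").upper()
    if not 15 <= len(iban) <= 34:
        return False
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", iban):
        return False
    # Move the first four characters to the end, map A..Z to 10..35,
    # and check that the resulting number leaves remainder 1 modulo 97.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list[str]:
    """Return checksum-valid IBANs found in (OCR'd) text, without spaces."""
    found = []
    for match in IBAN_CANDIDATE.finditer(text.upper()):
        tokens = match.group().split(" ")
        # The greedy pattern may swallow trailing words (e.g. a BIC), so
        # drop tokens from the right until the checksum passes.
        while tokens:
            candidate = "".join(tokens)
            if is_valid_iban(candidate):
                found.append(candidate)
                break
            tokens.pop()
    return found


if __name__ == "__main__":
    print(find_ibans("Pay to NL91 ABNA 0417 1643 00 BIC ABNANL2A"))
```

The checksum is what makes this useful on OCR output: a misread character almost always breaks the mod-97 test, so a garbled number is dropped rather than silently copied to the clipboard.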

4 comments

Personal lifehack: Use the address bar of your browser to view the clipboard content quickly and to omit any formatting from it.
This also conveniently sends it to your search provider, and possibly to the browser vendor for analytics.
These days it’s hard to be sure of anywhere you can paste a piece of text and be certain it’s not being sent to a server somewhere.
Depends on your settings.
We have a couple of large customers who will only send remittance advices as PDFs; they are several pages long and contain a couple of hundred rows. Apparently their system cannot send XLSX or any other format.

I've been a happy user of Tabula[1] for a few years and it works really well, for my needs anyway.

I just import, auto-detect tables, select "Stream", and then export to a CSV.

[1] https://tabula.technology/ and https://github.com/tabulapdf/tabula

> select a rectangle and perform OCR

You get that out of the box on an up-to-date KDE, from the screenshotting application Spectacle.

For the Mac there's TextSniper, which does just that.
There’s a PowerToys util for Windows that does the same - draw a rectangle around the text you want, and it OCRs it to your clipboard.
On Mac, I just do a quick screenshot and use the built-in OCR in Preview to select and copy text all the time.
Textgrabber does that without needing a screenshot. Found it here on HN, never looked back.
> Textgrabber does that without needing a screenshot.

There seems to be a product or two using that name: can you give a URL to the legitimate one?