by theoryofx 456 days ago
Yeah, I expect it'd mostly be useful for OCR and search. These are hard-to-read PDF files, and there are a lot of them.

I found a few projects related to using AI with The JFK Files, but they all seem old or uninteresting, which is why I'm asking here.

1 comment

Some prior discussion prompted by "Why LLMs Suck at OCR": https://news.ycombinator.com/item?id=42966958
I've tested Gemini 2.0 Flash on a bunch of the JFK Files PDFs and it's excellent.

Even with extremely blurry typewriter scans that are difficult for me to decipher.

It's incredible.

I'm sure there are cases where it will fail, but just OCRing 90% of the files would be a big win.