Hacker News
by fzysingularity 637 days ago
Re: obscure PDFs, I’d love to see a PDF dataset with a whole bunch of these from different domains.

I think in general it’s very hard to say whether any approach is “good enough” until you’ve tested it against a serious degree of variability in the input domain.