Hacker News
by jsweojtj 952 days ago
Related (but not identical): Facebook Research just released an open-source PDF -> Markdown reader that does a good job with equations in LaTeX.

https://facebookresearch.github.io/nougat/

2 comments

I've used it to convert 40-page PDFs into text, and it did an impressive job.
I've had really good results with it so far. I'm using it through the Hugging Face Transformers library, and it's been great for my workflow.