by hextex 597 days ago
Facebook's Nougat [1] should work with this, but I'm not sure how much preprocessing is needed to get good results from scanned copies of physical documents. Note that it outputs .mmd files (a Markdown dialect compatible with Mathpix Markdown, not MultiMarkdown), but the equations and tables should (iirc) come out as plain LaTeX.

1: https://github.com/facebookresearch/nougat
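
If the equations and tables really do come out as plain LaTeX, pulling them out of a .mmd file could look roughly like the sketch below. This is not part of Nougat: `extract_latex` is a hypothetical helper, and the delimiters it looks for (`\(...\)` for inline math, `\[...\]` for display math, `\begin{tabular}...\end{tabular}` for tables) are assumptions about the output format that should be checked against a real file.

```python
import re

# Assumed delimiters (not confirmed by the thread): \( ... \) inline math,
# \[ ... \] display math, and tabular environments for tables.
# The lookbehind skips LaTeX row breaks such as \\[2pt] inside tables.
INLINE_MATH = re.compile(r"(?<!\\)\\\((.+?)\\\)", re.DOTALL)
DISPLAY_MATH = re.compile(r"(?<!\\)\\\[(.+?)\\\]", re.DOTALL)
TABULAR = re.compile(r"\\begin\{tabular\}.*?\\end\{tabular\}", re.DOTALL)


def extract_latex(mmd_text):
    """Collect the LaTeX fragments found in a string of .mmd text."""
    return {
        "inline": [m.strip() for m in INLINE_MATH.findall(mmd_text)],
        "display": [m.strip() for m in DISPLAY_MATH.findall(mmd_text)],
        "tables": TABULAR.findall(mmd_text),
    }


# Hand-written sample standing in for real converter output.
sample = r"""
# Title
Energy is \(E = mc^2\).

\[ a^2 + b^2 = c^2 \]

\begin{table}
\begin{tabular}{cc}
x & y \\
1 & 2 \\
\end{tabular}
\end{table}
"""

result = extract_latex(sample)
```

For a real file, the text could be read with `pathlib.Path("paper.mmd").read_text(encoding="utf-8")` (the file name is a placeholder) and passed to `extract_latex`. Regexes like these will not handle nested environments, so treat it as a starting point only.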

1 comment

This looks really interesting! I will definitely have a look.