by dajonker 244 days ago
Try MinerU 2.5 with two-step parsing. It gives good results, with bounding boxes per block. Not sure if you can get it to go more fine-grained than that, such as word- or character-level boxes.