by h-jones
672 days ago
Anyone know how this compares to GROBID [1]? I'm looking at alternatives to GROBID as I'm not super pleased with its outputs. GROBID has a lot of great features for journal papers (reference extraction / parsing), but I'm only interested in cleanly extracting the body. Also considering nougat [2] but I haven't tried it yet.

[1] https://github.com/kermitt2/grobid
[2] https://github.com/facebookresearch/nougat