by jerkstate
76 days ago
After looking into it for a little while, Docling and Marker work pretty well but are very slow. I haven't found anything else that extracts math suitably. It takes 10+ minutes per PDF, so I'm going to run it on a batch of these papers overnight and create my own little Gaussian splatting RAG database. It's really too bad PDF is so terrible.