For use in retrieval/RAG, an emerging paradigm is to not parse the PDF at all.
By using a multi-modal foundation model, you convert visual representations ("screenshots") of the pdf directly into searchable vector representations.
Claude.ai handles tables very well, at least in my tests. It could easily convert a table from a financial document into a markdown table, among other things.
1. Render first 2 pages of PDF into a JPEG offline in the Mac app.
2. Upload JPEG to ChatGPT Vision and ask what would be a good file name for this.
It works surprisingly well.