The hard part isn't harnessing AI for search; it's preparing complex documents for LLM consumption: varied formats, structures, and designs, scans, multi-layout tables, and even poorly captured images.
To parse PDFs for RAG applications, you'll need a tool like LLMWhisperer[1] or unstructured.io[2].
Now back to your problem:
This might be overkill for your requirements, but you can try the following:
To set things up quickly, try Unstract[3], an open-source document processing tool. You can set it up yourself and bring your own LLM; it also supports local models.
It has a GUI for writing prompts to get insights from your documents.[4]
https://tika.apache.org/
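
If you go the Tika route, here's a rough sketch of pulling plain text out of a PDF through Tika Server's REST endpoint, using only the Python standard library. It assumes you already have Tika Server running locally on its default port (9998); the URL constant and the helper names `build_tika_request` and `extract_text` are mine for illustration, not part of Tika. Note that plain Tika only extracts text; scanned pages need OCR configured on top of it.

```python
import urllib.request

# Assumption: Tika Server is running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(pdf_bytes: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks Tika Server to return plain text."""
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="PUT",
        headers={
            "Content-Type": "application/pdf",  # tell Tika what we are sending
            "Accept": "text/plain",             # ask for plain text back
        },
    )


def extract_text(pdf_path: str, url: str = TIKA_URL) -> str:
    """Send a local PDF to Tika Server and return the extracted text."""
    with open(pdf_path, "rb") as f:
        req = build_tika_request(f.read(), url)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `extract_text("report.pdf")` (any local PDF path) gives you a string you can then chunk and embed for RAG. For messy scans or complex tables, the dedicated tools mentioned above will likely do better.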