by thor-rodrigues
327 days ago
I spent a good amount of time last year working on a system to analyse patent documents. Patents are difficult because they can include anything from abstract diagrams and chemical formulas to mathematical equations, so it tends to be really tricky to prepare the data in a way that can later be used by an LLM. The simplest approach I found was to “take a picture” of each page of the document and ask an LLM to generate a JSON explaining the content (plus some other metadata such as page number, number of visual elements, and so on). If any complicated image is present, simply ask the model to describe it. Once that is done, you have a JSON file that can be embedded into your vector store of choice. I can’t speak to the price-to-performance ratio, but this approach seems easier and more efficient than what the author is proposing.
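
For anyone who wants to try it, here is a minimal sketch of the page-by-page workflow described above, not the actual system. It assumes the pages have already been rendered to image bytes by some other tool. `describe_page` is a hypothetical callable standing in for whatever vision-capable LLM call you use, and the field names in `PageRecord` are illustrative choices based on the metadata mentioned in the comment.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, Iterable, List


@dataclass
class PageRecord:
    """One JSON-serialisable record per page (field names are illustrative)."""
    page_number: int
    visual_element_count: int
    content: str


# Hypothetical contract: takes the image bytes of one page and returns a dict
# with "content" (text explanation) and "visual_element_count" (int).
DescribeFn = Callable[[bytes], Dict[str, object]]


def build_page_records(page_images: Iterable[bytes],
                       describe_page: DescribeFn) -> List[PageRecord]:
    """Run the describer over every page image and collect one record per page."""
    records = []
    for page_number, image in enumerate(page_images, start=1):
        description = describe_page(image)
        records.append(PageRecord(
            page_number=page_number,
            visual_element_count=int(description.get("visual_element_count", 0)),
            content=str(description.get("content", "")),
        ))
    return records


def records_to_json(records: List[PageRecord]) -> str:
    """Serialise the records to a JSON string ready for chunking and embedding."""
    return json.dumps([asdict(r) for r in records], ensure_ascii=False, indent=2)


def fake_describe_page(image: bytes) -> Dict[str, object]:
    """Placeholder for the LLM call so the sketch runs without any API access."""
    return {"content": f"page image of {len(image)} bytes",
            "visual_element_count": 0}


if __name__ == "__main__":
    pages = [b"first-page-bytes", b"second-page-bytes"]
    print(records_to_json(build_page_records(pages, fake_describe_page)))
```

Passing the describer in as a parameter keeps the model call swappable, and the resulting JSON string can be handed to whichever embedding step and vector store you prefer.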