by constantinum
641 days ago
There is also LLMWhisperer, a document pre-processor built specifically for LLM consumption. As others mentioned, accuracy is only one of the selection criteria. Others include how well the preprocessing engine performs at large scale and how it handles very complex documents, such as bank loan forms with checkboxes or IRS tax forms with multi-layered nested tables. https://unstract.com/llmwhisperer/ LLMWhisperer is part of Unstract, an open-source tool for unstructured document ETL. https://github.com/Zipstack/unstract