by macklinkachorn
496 days ago
In my previous role, I ran into similar problems: rule-based parsing is really tricky to get right and often fails on edge cases. We (at https://runtrellis.com/) have been building a PDF processing pipeline from the ground up with LLMs and VLMs and have seen close to 100% accuracy even on tricky PDFs. The key is to use a rule-based engine and references to cross-check the data.
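To make the cross-checking idea concrete, here is a rough sketch in Python. It is not our actual pipeline. The function name `cross_check`, the field names, and the specific rules are all made up for illustration, and it assumes "references" means looking extracted values up in the source text, which is only one possible reading.

```python
import re
from decimal import Decimal


def cross_check(extracted: dict, source_text: str) -> list[str]:
    """Return a list of problems found in model-extracted fields.

    Hypothetical helper: field names and rules are illustrative assumptions.
    """
    problems = []

    # Rule check: the date must be in YYYY-MM-DD format.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", extracted.get("date", "")):
        problems.append("date: not in YYYY-MM-DD format")

    # Rule check: the line items must add up to the stated total.
    items_sum = sum(Decimal(x) for x in extracted.get("line_items", []))
    if items_sum != Decimal(extracted.get("total", "0")):
        problems.append("total: does not match sum of line items")

    # Reference check: key values must appear verbatim in the source text,
    # which catches values the model made up.
    for field in ("invoice_number", "total"):
        value = str(extracted.get(field, ""))
        if not value or value not in source_text:
            problems.append(f"{field}: value not found in source text")

    return problems


text = "Invoice INV-001 dated 2024-01-15. Items: 40.00, 60.50. Total: 100.50"
fields = {
    "invoice_number": "INV-001",
    "date": "2024-01-15",
    "line_items": ["40.00", "60.50"],
    "total": "100.50",
}
print(cross_check(fields, text))
```

Anything that comes back with problems can be retried or sent to a human for review instead of being trusted blindly.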