by codetrotter
499 days ago
Many moons ago I was tasked with extracting data from a bunch of PDFs. I made a tool to visualise how characters were laid out on the page, along with the bounding boxes of all the elements. In the end the project was a complete failure, and several people were upset with me for not delivering what I was supposed to. Today, with what LLMs can do to extract data from PDFs, I would 100% go the route of utilising AI to extract the data they wanted. Back then that did not yet exist.
|
OCR can take you pretty far depending on expectations, but it's never quite far enough in my experience.