by pedro_hab 1154 days ago
This was pretty bad for me. I tried asking for the name of a person referenced in the PDF and it couldn't find it. I asked who the claimant in this PDF is, and it said the claimant was empty.

But when I asked whether the claimant's name was in the PDF, it answered yes.

I am assuming the PDF-to-text step is not working great here, which I suppose is the whole point.

2 comments

Yup. Having worked on this for a while, I've found it's best to extract the images of the PDF, then send them to Google Vision for text extraction.

I have it working with 600 page documents.
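
For anyone wanting to try that approach, here is a rough sketch of what it could look like in Python. This is not the commenter's actual code. It assumes the `pdf2image` and `google-cloud-vision` packages plus working Google Cloud credentials, and the helper names (`render_pdf_pages`, `make_google_vision_ocr`, `ocr_pages`, `join_pages`) are made up for illustration. The OCR step is passed in as a plain callable so the page loop can be tried without any cloud access.

```python
import io
from typing import Callable, Iterable, List


def render_pdf_pages(pdf_path: str, dpi: int = 200) -> List[bytes]:
    """Render each PDF page to PNG bytes (assumes pdf2image and poppler are installed)."""
    from pdf2image import convert_from_path  # lazy import: only needed for real PDFs

    pages = []
    for image in convert_from_path(pdf_path, dpi=dpi):
        buffer = io.BytesIO()
        image.save(buffer, format="PNG")
        pages.append(buffer.getvalue())
    return pages


def make_google_vision_ocr() -> Callable[[bytes], str]:
    """Build an OCR callable backed by Google Cloud Vision (assumes credentials are set up)."""
    from google.cloud import vision  # lazy import: only needed for the real service

    client = vision.ImageAnnotatorClient()

    def ocr(image_bytes: bytes) -> str:
        response = client.document_text_detection(
            image=vision.Image(content=image_bytes)
        )
        if response.error.message:
            raise RuntimeError(response.error.message)
        return response.full_text_annotation.text

    return ocr


def ocr_pages(page_images: Iterable[bytes], ocr: Callable[[bytes], str]) -> List[str]:
    """Run the OCR callable over every page image, one text string per page."""
    return [ocr(image).strip() for image in page_images]


def join_pages(page_texts: List[str]) -> str:
    """Label each page's text with its page number before handing it to a model."""
    return "\n\n".join(
        f"[page {number}]\n{text}"
        for number, text in enumerate(page_texts, start=1)
    )
```

With credentials in place, usage would be something like `join_pages(ocr_pages(render_pdf_pages("claim.pdf"), make_google_vision_ocr()))`, where `claim.pdf` is a placeholder path. Keeping the page labels in the text is a guess at what helps the model point to where a name appears. For a 600 page document you would probably also want batching and retries, which this sketch leaves out.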

Yeah, same here. I uploaded some test text and asked how many children my coworker has. It said "Your coworker has no children". I said that the text says my coworker has 2 children. The answer was, "You are right, your coworker has 2 children as mentioned on page 2".