by hugodutka
700 days ago
I used this approach extensively over the past couple of months with GPT-4 and GPT-4o while building https://hotseatai.com. Two things that helped me:

1. Prompt with examples. I included an example image with an example transcription as part of the prompt. This made GPT make fewer mistakes and improved output accuracy.

2. Confidence score. I extracted the embedded text from the PDF and compared the frequency of character triples in the source text and GPT’s output. If there was a significant difference (less than 90% overlap) I would log a warning. This helped detect cases when GPT omitted entire paragraphs of text.
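A minimal sketch of the confidence score from point 2, in Python. The comment does not say how the overlap was computed, so everything here is an assumption: the function names are hypothetical, whitespace and case are ignored before counting, and overlap is taken as the share of the source's character triples that also appear in the model output.

```python
import logging
from collections import Counter

logger = logging.getLogger(__name__)


def trigram_counts(text: str) -> Counter:
    # Assumption: drop whitespace and case so layout differences
    # between the PDF text and the Markdown output do not count.
    normalized = "".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def trigram_overlap(source: str, output: str) -> float:
    """Share of the source's character triples also found in the output."""
    src = trigram_counts(source)
    out = trigram_counts(output)
    total = sum(src.values())
    if total == 0:
        return 1.0  # nothing to compare against
    # Counter & Counter keeps the minimum count for each triple.
    shared = sum((src & out).values())
    return shared / total


def warn_if_low_overlap(source: str, output: str, threshold: float = 0.9) -> bool:
    """Log a warning and return True when overlap falls below the threshold."""
    overlap = trigram_overlap(source, output)
    if overlap < threshold:
        logger.warning("Low trigram overlap: %.2f", overlap)
        return True
    return False
```

Measuring against the source side only means extra Markdown syntax in the output is not penalized, while a dropped paragraph lowers the score sharply.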
- Request #1 => page_1_image
- Request #2 => page_1_markdown + page_2_image
- Request #3 => page_2_markdown + page_3_image
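The request chain above could be sketched as a loop in Python, assuming `page_N_markdown` is the output of the request for page N. `transcribe_pages` and the `transcribe` callable are hypothetical names; `transcribe` stands in for whatever model call is actually used and is not a real library API.

```python
from typing import Callable, List, Optional, Sequence


def transcribe_pages(
    page_images: Sequence[bytes],
    transcribe: Callable[[Optional[str], bytes], str],
) -> List[str]:
    """Transcribe pages in order, passing the previous page's Markdown along.

    `transcribe(previous_markdown, image)` is a hypothetical callable that
    wraps the model request; previous_markdown is None for the first page.
    """
    markdown_pages: List[str] = []
    previous: Optional[str] = None
    for image in page_images:
        current = transcribe(previous, image)
        markdown_pages.append(current)
        previous = current  # context for the next request
    return markdown_pages
```

Because each request depends on the previous result, the pages have to be processed sequentially rather than in parallel.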